Bayesian Quantification with Black-Box Estimators
- URL: http://arxiv.org/abs/2302.09159v1
- Date: Fri, 17 Feb 2023 22:10:04 GMT
- Title: Bayesian Quantification with Black-Box Estimators
- Authors: Albert Ziegler, Pawe{\l} Czy\.z
- Abstract summary: Approaches like adjusted classify and count, black-box shift estimators, and invariant ratio estimators use an auxiliary (and potentially biased) black-box classifier to estimate the class distribution and yield guarantees under weak assumptions.
We demonstrate that all these algorithms are closely related to the inference in a particular Bayesian Chain model, approxing the assumed ground-truthgenerative process.
Then, we discuss an efficient Markov Monte Carlo sampling scheme for the introduced model and show an consistency guarantee in the large-data limit.
- Score: 1.599072005190786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how different classes are distributed in an unlabeled data set
is an important challenge for the calibration of probabilistic classifiers and
uncertainty quantification. Approaches like adjusted classify and count,
black-box shift estimators, and invariant ratio estimators use an auxiliary
(and potentially biased) black-box classifier trained on a different (shifted)
data set to estimate the class distribution and yield asymptotic guarantees
under weak assumptions. We demonstrate that all these algorithms are closely
related to the inference in a particular Bayesian model, approximating the
assumed ground-truth generative process. Then, we discuss an efficient Markov
Chain Monte Carlo sampling scheme for the introduced model and show an
asymptotic consistency guarantee in the large-data limit. We compare the
introduced model against the established point estimators in a variety of
scenarios, and show it is competitive, and in some cases superior, with the
state of the art.
Related papers
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z) - Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving.
Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees.
In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
deterministic quantization suffers from severe codebook collapse and misalignment with inference stage while quantization suffers from low codebook utilization and reconstruction objective.
This paper presents a regularized vector quantization framework that allows to mitigate perturbed above issues effectively by applying regularization from two perspectives.
arXiv Detail & Related papers (2023-03-11T15:20:54Z) - Bayesian Hierarchical Models for Counterfactual Estimation [12.159830463756341]
We propose a probabilistic paradigm to estimate a diverse set of counterfactuals.
We treat the perturbations as random variables endowed with prior distribution functions.
A gradient based sampler with superior convergence characteristics efficiently computes the posterior samples.
arXiv Detail & Related papers (2023-01-21T00:21:11Z) - Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions [16.876111500144667]
We introduce a novel probabilistic clustering method, referred to as k-sBetas.
We provide a general maximum a posteriori (MAP) perspective of clustering distributions.
Our code and comparisons with the existing simplex-clustering approaches and our introduced softmax-prediction benchmarks are publicly available.
arXiv Detail & Related papers (2022-07-30T18:29:11Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - A generalized Bayes framework for probabilistic clustering [3.3194866396158]
Loss-based clustering methods, such as k-means and its variants, are standard tools for finding groups in data.
Model-based clustering based on mixture models provides an alternative, but such methods face computational problems and large sensitivity to the choice of kernel.
This article proposes a generalized Bayes framework that bridges between these two paradigms through the use of Gibbs posteriors.
arXiv Detail & Related papers (2020-06-09T18:49:32Z) - Robust M-Estimation Based Bayesian Cluster Enumeration for Real
Elliptically Symmetric Distributions [5.137336092866906]
Robustly determining optimal number of clusters in a data set is an essential factor in a wide range of applications.
This article generalizes so that it can be used with any arbitrary Really Symmetric (RES) distributed mixture model.
We derive a robust criterion for data sets with finite sample size, and also provide an approximation to reduce the computational cost at large sample sizes.
arXiv Detail & Related papers (2020-05-04T11:44:49Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.