Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations
- URL: http://arxiv.org/abs/2108.06514v1
- Date: Sat, 14 Aug 2021 10:59:14 GMT
- Title: Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations
- Authors: Vihari Piratla, Soumen Chakrabarti, Sunita Sarawagi
- Abstract summary: Attributed Accuracy Assay (AAA) is a probabilistic estimator for such an accuracy surface.
We show that an obvious application of GPs cannot address the challenge of heteroscedastic uncertainty over a huge, sparsely populated attribute space.
We present two enhancements: pooling sparse observations, and regularizing the scale parameter of the Beta densities.
- Score: 22.18147577177574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our goal is to evaluate the accuracy of a black-box classification model, not
as a single aggregate on a given test data distribution, but as a surface over
a large number of combinations of attributes characterizing multiple test data
distributions. Such attributed accuracy measures become important as machine
learning models get deployed as a service, where the training data distribution
is hidden from clients, and different clients may be interested in diverse
regions of the data distribution. We present Attributed Accuracy Assay (AAA), a
Gaussian Process (GP)-based probabilistic estimator for such an accuracy
surface. Each attribute combination, called an 'arm', is associated with a Beta
density from which the service's accuracy is sampled. We expect the GP to
smooth the parameters of the Beta density over related arms to mitigate
sparsity. We show that obvious application of GPs cannot address the challenge
of heteroscedastic uncertainty over a huge attribute space that is sparsely and
unevenly populated. In response, we present two enhancements: pooling sparse
observations, and regularizing the scale parameter of the Beta densities. After
introducing these innovations, we establish the effectiveness of AAA in terms
of both its estimation accuracy and exploration efficiency, through extensive
experiments and analysis.
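The abstract's core idea (a Beta accuracy density per arm, with information shared across related arms to fight sparsity) can be illustrated with a minimal numpy sketch. This is not the authors' implementation: it replaces the GP with simple RBF-kernel pooling of per-arm counts, and all names (`rbf_kernel`, `smoothed_beta_accuracy`) and the toy data are assumptions for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Pairwise similarity between attribute vectors (arms).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * lengthscale ** 2))

def smoothed_beta_accuracy(X, successes, trials, prior=(1.0, 1.0), lengthscale=1.0):
    """Per-arm Beta posterior over accuracy, with counts pooled
    (kernel-smoothed) from related arms -- a crude stand-in for the
    paper's GP smoothing of Beta parameters."""
    K = rbf_kernel(X, lengthscale)
    # Pool fractional counts from neighbouring arms to mitigate sparsity.
    pooled_s = K @ successes
    pooled_t = K @ trials
    a = prior[0] + pooled_s
    b = prior[1] + (pooled_t - pooled_s)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var

# Toy example: 4 arms described by 2 binary attributes; arm 1 is unobserved.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
successes = np.array([8.0, 0.0, 1.0, 9.0])   # correct predictions per arm
trials    = np.array([10.0, 0.0, 2.0, 10.0])
mean, var = smoothed_beta_accuracy(X, successes, trials)
```

The unobserved arm inherits a nonzero accuracy estimate (with wide variance) from its high-accuracy neighbours, which is the smoothing behaviour the paper expects of the GP.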
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
- Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance [4.291589126905706]
In the AutoML domain, test accuracy is heralded as the quintessential metric for evaluating model efficacy.
However, the reliability of test accuracy as the primary performance metric has been called into question.
The distribution of hard samples between training and test sets affects the difficulty levels of those sets.
We propose a benchmarking procedure for comparing hard sample identification methods.
arXiv Detail & Related papers (2024-09-22T11:38:14Z)
- Predictive Accuracy-Based Active Learning for Medical Image Segmentation [5.25147264940975]
We propose an efficient Predictive Accuracy-based Active Learning (PAAL) method for medical image segmentation.
PAAL consists of an Accuracy Predictor (AP) and a Weighted Polling Strategy (WPS).
Experiment results on multiple datasets demonstrate the superiority of PAAL.
arXiv Detail & Related papers (2024-05-01T11:12:08Z)
- A Targeted Accuracy Diagnostic for Variational Approximations [8.969208467611896]
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC)
Existing methods characterize the quality of the whole variational distribution.
We propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA).
arXiv Detail & Related papers (2023-02-24T02:50:18Z)
- Bayes Classification using an approximation to the Joint Probability Distribution of the Attributes [1.0660480034605242]
We propose an approach that estimates conditional probabilities using information in the neighbourhood of the test sample.
We illustrate the performance of the proposed approach on a wide range of datasets taken from the University of California at Irvine (UCI) Machine Learning Repository.
arXiv Detail & Related papers (2022-05-29T22:24:02Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
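The thresholding idea in ATC is simple enough to sketch. The following is inferred from the summary above, not taken from the authors' code; the function name and toy data are assumptions.

```python
import numpy as np

def atc_predict_accuracy(src_conf, src_correct, tgt_conf):
    """Sketch of Average Thresholded Confidence (ATC): pick a threshold t
    on labeled-source confidences so that the fraction of source examples
    with confidence above t matches the source accuracy, then predict
    target accuracy as the fraction of unlabeled target examples whose
    confidence exceeds t."""
    src_acc = src_correct.mean()
    # The (1 - src_acc) quantile leaves a src_acc-sized fraction above t.
    t = np.quantile(src_conf, 1.0 - src_acc)
    return float((tgt_conf > t).mean())

# Toy check: confidences on a grid; the model is correct exactly when confident.
src_conf = np.linspace(0.5, 1.0, 100)
src_correct = (src_conf > 0.75).astype(float)
pred_acc = atc_predict_accuracy(src_conf, src_correct, src_conf)
```

When the target distribution equals the source, the predicted accuracy recovers the true source accuracy, as expected.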
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Data Dependent Randomized Smoothing [127.34833801660233]
We show that our data dependent framework can be seamlessly incorporated into 3 randomized smoothing approaches.
We obtain 9% and 6% improvements over the certified accuracy of the strongest baseline at a radius of 0.5 on CIFAR10 and ImageNet, respectively.
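For context, the standard randomized-smoothing prediction step that such work builds on can be sketched as a majority vote under Gaussian input noise. This is a generic sketch, not the paper's method: the data-dependent contribution would choose `sigma` per input, which here is simply an argument, and the toy classifier `clf` is a hypothetical stand-in.

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma, n_samples=1000, rng=None):
    """Plain randomized-smoothing prediction: majority vote of the base
    classifier over Gaussian perturbations of the input. A data-dependent
    scheme would supply a per-input sigma."""
    rng = np.random.default_rng(rng)
    noise = rng.normal(0.0, sigma, size=(n_samples,) + x.shape)
    preds = np.array([base_classifier(x + eps) for eps in noise])
    classes, counts = np.unique(preds, return_counts=True)
    return int(classes[np.argmax(counts)])

# Toy base classifier: predicts class 1 iff the first coordinate is positive.
clf = lambda z: int(z[0] > 0)
```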
arXiv Detail & Related papers (2020-12-08T10:53:11Z) - AutoAssign: Differentiable Label Assignment for Dense Object Detection [94.24431503373884]
AutoAssign is an anchor-free detector for object detection.
It achieves appearance-aware label assignment through a fully differentiable weighting mechanism.
Our best model achieves 52.1% AP, outperforming all existing one-stage detectors.
arXiv Detail & Related papers (2020-07-07T14:32:21Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.