Active Assessment of Prediction Services as Accuracy Surface Over
Attribute Combinations
- URL: http://arxiv.org/abs/2108.06514v1
- Date: Sat, 14 Aug 2021 10:59:14 GMT
- Title: Active Assessment of Prediction Services as Accuracy Surface Over
Attribute Combinations
- Authors: Vihari Piratla, Soumen Chakrabarti, Sunita Sarawagi
- Abstract summary: Attributed Accuracy Assay (AAA) is a probabilistic estimator for such an accuracy surface.
We show that a straightforward application of GPs cannot address the challenge of heteroscedastic uncertainty over a huge, sparsely populated attribute space.
We present two enhancements: pooling sparse observations, and regularizing the scale parameter of the Beta densities.
- Score: 22.18147577177574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our goal is to evaluate the accuracy of a black-box classification model, not
as a single aggregate on a given test data distribution, but as a surface over
a large number of combinations of attributes characterizing multiple test data
distributions. Such attributed accuracy measures become important as machine
learning models get deployed as a service, where the training data distribution
is hidden from clients, and different clients may be interested in diverse
regions of the data distribution. We present Attributed Accuracy Assay (AAA),
a Gaussian Process (GP)-based probabilistic estimator for such an accuracy
surface. Each attribute combination, called an 'arm', is associated with a Beta
density from which the service's accuracy is sampled. We expect the GP to
smooth the parameters of the Beta density over related arms to mitigate
sparsity. We show that obvious application of GPs cannot address the challenge
of heteroscedastic uncertainty over a huge attribute space that is sparsely and
unevenly populated. In response, we present two enhancements: pooling sparse
observations, and regularizing the scale parameter of the Beta densities. After
introducing these innovations, we establish the effectiveness of AAA in terms
of both its estimation accuracy and exploration efficiency, through extensive
experiments and analysis.
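The abstract's model can be illustrated with a minimal numpy sketch: each arm (attribute combination) carries a Beta density over the service's accuracy, and statistics are shared across related arms. This is not the authors' implementation; kernel-weighted pooling of Beta pseudo-counts stands in here for full GP inference, and all function names, the RBF kernel choice, and the uniform Beta(1, 1) prior are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, length_scale=1.0):
    """Squared-exponential similarity between attribute vectors (arms)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def smoothed_accuracy(X, correct, total, length_scale=1.0, prior=(1.0, 1.0)):
    """Estimate per-arm accuracy by kernel-smoothing Beta pseudo-counts
    across related arms, so that sparsely observed arms borrow strength
    from similar, better-observed ones."""
    K = rbf_kernel(X, X, length_scale)       # arm-to-arm similarity
    W = K / K.sum(axis=1, keepdims=True)     # normalised smoothing weights
    a0, b0 = prior                           # Beta prior pseudo-counts
    alpha = a0 + W @ correct                 # pooled success counts
    beta = b0 + W @ (total - correct)        # pooled failure counts
    return alpha / (alpha + beta)            # posterior mean per Beta density
```

An arm with zero observations inherits its estimate almost entirely from its nearest neighbours in attribute space, which is the smoothing behaviour the abstract expects from the GP.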
Related papers
- Predictive Accuracy-Based Active Learning for Medical Image Segmentation [5.25147264940975]
We propose an efficient Predictive Accuracy-based Active Learning (PAAL) method for medical image segmentation.
PAAL consists of an Accuracy Predictor (AP) and a Weighted Polling Strategy (WPS).
Experiment results on multiple datasets demonstrate the superiority of PAAL.
arXiv Detail & Related papers (2024-05-01T11:12:08Z)
- A Targeted Accuracy Diagnostic for Variational Approximations [8.969208467611896]
Variational Inference (VI) is an attractive alternative to Markov Chain Monte Carlo (MCMC).
Existing methods characterize the quality of the whole variational distribution.
We propose the TArgeted Diagnostic for Distribution Approximation Accuracy (TADDAA)
arXiv Detail & Related papers (2023-02-24T02:50:18Z)
- Bayes Classification using an approximation to the Joint Probability Distribution of the Attributes [1.0660480034605242]
We propose an approach that estimates conditional probabilities using information in the neighbourhood of the test sample.
We illustrate the performance of the proposed approach on a wide range of datasets taken from the University of California at Irvine (UCI) Machine Learning Repository.
arXiv Detail & Related papers (2022-05-29T22:24:02Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
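The thresholding idea in ATC can be sketched in a few lines. This is a toy instantiation, not the paper's exact procedure: it assumes a quantile-matching rule for picking the threshold (choose t so that the fraction of labeled source examples above t equals source accuracy), and the function names are illustrative.

```python
import numpy as np

def learn_threshold(source_conf, source_correct):
    """Pick t so that the share of source examples with confidence > t
    matches the observed source accuracy."""
    return np.quantile(source_conf, 1.0 - source_correct.mean())

def predict_target_accuracy(target_conf, t):
    """Predicted accuracy = fraction of unlabeled target examples
    whose confidence exceeds the learned threshold."""
    return (target_conf > t).mean()
```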
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserve relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Data Dependent Randomized Smoothing [127.34833801660233]
We show that our data dependent framework can be seamlessly incorporated into 3 randomized smoothing approaches.
We get 9% and 6% improvement over the certified accuracy of the strongest baseline for a radius of 0.5 on CIFAR10 and ImageNet.
arXiv Detail & Related papers (2020-12-08T10:53:11Z)
- Few-shot Learning for Spatial Regression [31.022722103424684]
We propose a few-shot learning method for spatial regression.
Our model is trained using spatial datasets on various attributes in various regions.
In our experiments, we demonstrate that the proposed method achieves better predictive performance than existing meta-learning methods.
arXiv Detail & Related papers (2020-10-09T04:05:01Z)
- AutoAssign: Differentiable Label Assignment for Dense Object Detection [94.24431503373884]
AutoAssign is an anchor-free detector for dense object detection.
It achieves appearance-aware label assignment through a fully differentiable weighting mechanism.
Our best model achieves 52.1% AP, outperforming all existing one-stage detectors.
arXiv Detail & Related papers (2020-07-07T14:32:21Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.