Estimating Multi-label Accuracy using Labelset Distributions
- URL: http://arxiv.org/abs/2209.04163v1
- Date: Fri, 9 Sep 2022 07:47:35 GMT
- Title: Estimating Multi-label Accuracy using Labelset Distributions
- Authors: Laurence A. F. Park, Jesse Read
- Abstract summary: A multi-label classifier estimates the binary label state for each of a set of concept labels, for any given instance.
We show that the expected accuracy can be estimated from the multi-label predictive distribution.
- Score: 1.5076964620370268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A multi-label classifier estimates the binary label state (relevant vs
irrelevant) for each of a set of concept labels, for any given instance.
Probabilistic multi-label classifiers provide a predictive posterior
distribution over all possible labelset combinations of such label states (the
powerset of labels) from which we can provide the best estimate, simply by
selecting the labelset corresponding to the largest expected accuracy, over
that distribution. For example, in maximizing exact match accuracy, we provide
the mode of the distribution. But how does this relate to the confidence we may
have in such an estimate? Confidence is an important element of real-world
applications of multi-label classifiers (as in machine learning in general) and
is an important ingredient in explainability and interpretability. However, it
is not obvious how to provide confidence in the multi-label context with respect
to a particular accuracy metric, nor is it clear how to provide a confidence
measure which correlates well with the expected accuracy, which would be
most valuable in real-world decision making. In this article we estimate the
expected accuracy as a surrogate for confidence, for a given accuracy metric.
We hypothesise that the expected accuracy can be estimated from the multi-label
predictive distribution. We examine seven candidate functions for their ability
to estimate expected accuracy from the predictive distribution. We found three
of these to correlate well with expected accuracy and to be robust. Further, we
determined that each candidate function can be used separately to estimate
Hamming similarity, but a combination of the candidates was best for expected
Jaccard index and exact match.
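As a concrete illustration of the decision rule sketched in the abstract, the following Python fragment (an illustrative sketch, not code from the paper; the dict-based representation of the labelset distribution, the toy probabilities, and all function names are assumptions) enumerates a small explicit distribution over the labelset powerset and selects the labelset with the largest expected accuracy under exact match, Hamming similarity, and Jaccard index. Maximizing expected exact match recovers the mode of the distribution, and the maximized expectation itself is the kind of confidence surrogate the article investigates.

```python
# Illustrative sketch (not from the paper): choosing a labelset by expected accuracy.
# The predictive distribution is assumed to be given explicitly over the labelset
# powerset as a dict mapping binary tuples to probabilities.
from itertools import product

def exact_match(y, z):
    return float(y == z)

def hamming_similarity(y, z):
    return sum(a == b for a, b in zip(y, z)) / len(y)

def jaccard(y, z):
    inter = sum(a and b for a, b in zip(y, z))
    union = sum(a or b for a, b in zip(y, z))
    return inter / union if union else 1.0  # two empty labelsets match exactly

def expected_accuracy(candidate, dist, metric):
    # Expectation of the chosen accuracy metric over the labelset distribution.
    return sum(p * metric(candidate, y) for y, p in dist.items())

def best_labelset(dist, metric, n_labels):
    # Search the powerset for the labelset with the largest expected accuracy.
    return max(product((0, 1), repeat=n_labels),
               key=lambda c: expected_accuracy(c, dist, metric))

# Toy predictive distribution over 3 labels (probabilities sum to 1).
dist = {
    (1, 0, 0): 0.40,
    (1, 1, 0): 0.35,
    (0, 1, 0): 0.15,
    (1, 1, 1): 0.10,
}

for name, metric in [("exact match", exact_match),
                     ("Hamming", hamming_similarity),
                     ("Jaccard", jaccard)]:
    y_hat = best_labelset(dist, metric, n_labels=3)
    print(f"{name}: prediction {y_hat}, expected accuracy "
          f"{expected_accuracy(y_hat, dist, metric):.3f}")
```

In this toy distribution, exact match selects the mode (1, 0, 0) with expected accuracy 0.40, whereas Hamming similarity follows the per-label marginals and selects (1, 1, 0); each selection comes with an expected-accuracy value that can be read as a confidence score for that prediction.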
Related papers
- Predicting generalization performance with correctness discriminators [64.00420578048855]
We present a novel model that establishes upper and lower bounds on the accuracy, without requiring gold labels for the unseen data.
We show across a variety of tagging, parsing, and semantic parsing tasks that the gold accuracy is reliably between the predicted upper and lower bounds.
arXiv Detail & Related papers (2023-11-15T22:43:42Z) - Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias [5.698050337128548]
Self-training is a well-known approach for semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples.
For neural networks, softmax prediction probabilities are often used as a confidence measure, although they are known to be overconfident, even for wrong predictions.
We propose a novel confidence measure, called $\mathcal{T}$-similarity, built upon the prediction diversity of an ensemble of linear classifiers.
arXiv Detail & Related papers (2023-10-23T11:30:06Z) - PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z) - Confidence and Dispersity Speak: Characterising Prediction Matrix for
Unsupervised Accuracy Estimation [51.809741427975105]
This work aims to assess how well a model performs under distribution shifts without using labels.
We use the nuclear norm, which has been shown to be effective in characterizing both confidence and dispersity.
We show that the nuclear norm is more accurate and robust for accuracy estimation than existing methods.
arXiv Detail & Related papers (2023-02-02T13:30:48Z) - From Classification Accuracy to Proper Scoring Rules: Elicitability of
Probabilistic Top List Predictions [0.0]
I propose a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions.
The proposed evaluation metrics are based on symmetric proper scoring rules and admit comparison of various types of predictions.
arXiv Detail & Related papers (2023-01-27T15:55:01Z) - Test-time Recalibration of Conformal Predictors Under Distribution Shift
Based on Unlabeled Examples [30.61588337557343]
Conformal predictors provide uncertainty estimates by computing a set of classes that contains the true class with a user-specified probability (a generic split-conformal sketch appears after this list).
We propose a method that provides excellent uncertainty estimates under natural distribution shifts.
arXiv Detail & Related papers (2022-10-09T04:46:00Z) - Distribution-free uncertainty quantification for classification under
label shift [105.27463615756733]
We focus on uncertainty quantification (UQ) for classification problems via two avenues.
We first argue that label shift hurts UQ, by showing degradation in coverage and calibration.
We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
arXiv Detail & Related papers (2021-03-04T20:51:03Z) - Selective Classification Can Magnify Disparities Across Groups [89.14499988774985]
We find that while selective classification can improve average accuracies, it can simultaneously magnify existing accuracy disparities.
Increasing abstentions can even decrease accuracies on some groups.
We train distributionally-robust models that achieve similar full-coverage accuracies across groups and show that selective classification uniformly improves each group.
arXiv Detail & Related papers (2020-10-27T08:51:30Z) - Class-Similarity Based Label Smoothing for Confidence Calibration [2.055949720959582]
We propose a novel form of label smoothing to improve confidence calibration.
Since different classes are of different intrinsic similarities, more similar classes should result in closer probability values in the final output.
This motivates the development of a new smooth label where the label values are based on similarities with the reference class.
arXiv Detail & Related papers (2020-06-24T20:26:22Z) - Distribution-free binary classification: prediction sets, confidence
intervals and calibration [106.50279469344937]
We study three notions of uncertainty quantification -- calibration, confidence intervals and prediction sets -- for binary classification in the distribution-free setting.
We derive confidence intervals for binned probabilities for both fixed-width and uniform-mass binning.
As a consequence of our 'tripod' theorems, these confidence intervals for binned probabilities lead to distribution-free calibration.
arXiv Detail & Related papers (2020-06-18T14:17:29Z) - Knowing what you know: valid and validated confidence sets in multiclass
and multilabel prediction [0.8594140167290097]
We develop conformal prediction methods for constructing valid confidence sets in multiclass and multilabel problems.
By leveraging ideas from quantile regression, we build methods that always guarantee correct coverage but additionally provide conditional coverage for both multiclass and multilabel prediction problems.
arXiv Detail & Related papers (2020-04-21T17:45:38Z)
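Several of the entries above (PAC Prediction Sets Under Label Shift; Test-time Recalibration of Conformal Predictors; Knowing what you know) build on conformal prediction. For orientation only, the sketch below shows the generic split-conformal construction for classification (a textbook baseline under stated assumptions, not the method of any listed paper; the function names and the choice of nonconformity score are assumptions): calibration scores are one minus the probability assigned to the true class, and a test point's prediction set keeps every class whose score falls within the calibrated quantile.

```python
# Generic split-conformal prediction sets (illustrative baseline only).
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    # Finite-sample-corrected quantile targeting >= 1 - alpha marginal coverage.
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_set(test_probs, threshold):
    # Keep every class whose nonconformity score is within the threshold.
    return [np.where(1.0 - p <= threshold)[0].tolist() for p in test_probs]

# Toy example: 5 calibration points and 2 test points over 3 classes.
rng = np.random.default_rng(0)
cal_probs = rng.dirichlet(np.ones(3), size=5)
cal_labels = np.array([0, 2, 1, 0, 2])
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.2)
print(prediction_set(rng.dirichlet(np.ones(3), size=2), threshold))
```

The listed papers adapt constructions of this kind to label shift, to test-time distribution shift, and to multiclass and multilabel prediction, respectively.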
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.