Performance Metrics for Probabilistic Ordinal Classifiers
- URL: http://arxiv.org/abs/2309.08701v1
- Date: Fri, 15 Sep 2023 18:45:15 GMT
- Title: Performance Metrics for Probabilistic Ordinal Classifiers
- Authors: Adrian Galdran
- Abstract summary: Ordinal classification models assign higher penalties to predictions further away from the true class.
This paper advocates the use of the Ranked Probability Score (RPS) for image grading tasks.
- Score: 1.6653762541912462
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ordinal classification models assign higher penalties to predictions further
away from the true class. As a result, they are appropriate for relevant
diagnostic tasks like disease progression prediction or medical image grading.
The consensus for assessing their categorical predictions dictates the use of
distance-sensitive metrics like the Quadratic-Weighted Kappa score or the
Expected Cost. However, there has been little discussion regarding how to
measure performance of probabilistic predictions for ordinal classifiers. In
conventional classification, common measures for probabilistic predictions are
Proper Scoring Rules (PSR) like the Brier score, or Calibration Errors like the
ECE, yet these are not optimal choices for ordinal classification. A PSR named
Ranked Probability Score (RPS), widely popular in the forecasting field, is
more suitable for this task, but it has received no attention in the image
analysis community. This paper advocates the use of the RPS for image grading
tasks. In addition, we demonstrate a counter-intuitive and questionable
behavior of this score, and propose a simple fix for it. Comprehensive
experiments on four large-scale biomedical image grading problems over three
different datasets show that the RPS is a more suitable performance metric for
probabilistic ordinal predictions. Code to reproduce our experiments can be
found at https://github.com/agaldran/prob_ord_metrics .
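For concreteness, the Ranked Probability Score compares the cumulative distribution of the predicted probabilities against the cumulative one-hot distribution of the true grade, so probability mass placed far from the true class is penalized more heavily, unlike the Brier score. The sketch below illustrates this under the common textbook definitions; it is only an illustration, not the paper's reference implementation (that lives in the repository linked above), and the division by K-1 is just one frequent normalization convention.

import numpy as np

def ranked_probability_score(probs, labels, normalize=True):
    # Mean RPS over a batch: squared differences between the predicted and
    # the one-hot cumulative distributions, summed over the K ordered classes.
    probs = np.asarray(probs, dtype=float)
    n, k = probs.shape
    onehot = np.eye(k)[np.asarray(labels)]
    rps = np.sum((np.cumsum(probs, axis=1) - np.cumsum(onehot, axis=1)) ** 2, axis=1)
    return (rps / (k - 1)).mean() if normalize else rps.mean()

def brier_score(probs, labels):
    # Multi-class Brier score: mean squared error against the one-hot target.
    # It ignores the ordering of the classes, unlike the RPS.
    probs = np.asarray(probs, dtype=float)
    onehot = np.eye(probs.shape[1])[np.asarray(labels)]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

# Toy example with 5 ordered grades: both predictions put 0.6 on a wrong grade,
# but only the RPS distinguishes the nearby mistake from the distant one.
y = [0]
near = [[0.4, 0.6, 0.0, 0.0, 0.0]]
far = [[0.4, 0.0, 0.0, 0.0, 0.6]]
print(brier_score(near, y), brier_score(far, y))          # identical
print(ranked_probability_score(near, y),
      ranked_probability_score(far, y))                   # far > near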
Related papers
- Conformal Prediction Sets with Improved Conditional Coverage using Trust Scores [52.92618442300405]
It is impossible to achieve exact, distribution-free conditional coverage in finite samples.
We propose an alternative conformal prediction algorithm that targets coverage where it matters most.
arXiv Detail & Related papers (2025-01-17T12:01:56Z) - Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z) - Calibration of Ordinal Regression Networks [1.2242167538741824]
Deep neural networks are not well-calibrated and often produce over-confident predictions.
We propose a novel loss function that introduces ordinal-aware calibration.
It incorporates soft ordinal encoding and ordinal-aware regularization to enforce both calibration and unimodality (a generic sketch of soft ordinal encoding appears after this list).
arXiv Detail & Related papers (2024-10-21T05:56:31Z) - On Temperature Scaling and Conformal Prediction of Deep Classifiers [9.975341265604577]
Two popular approaches for attaching confidence information to a deep classifier's predictions are: 1) Calibration (typically via Temperature Scaling): modifies the classifier's softmax values so that the maximal value better estimates the correctness probability; and 2) Conformal Prediction (CP): produces a prediction set of candidate labels that contains the true label with a user-specified probability.
In practice, both types of indication are desirable, yet the interplay between them has so far not been investigated (a minimal sketch of both tools appears after this list).
arXiv Detail & Related papers (2024-02-08T16:45:12Z) - From Classification Accuracy to Proper Scoring Rules: Elicitability of
Probabilistic Top List Predictions [0.0]
I propose a novel type of prediction in classification, which bridges the gap between single-class predictions and predictive distributions.
The proposed evaluation metrics are based on symmetric proper scoring rules and admit comparison of various types of predictions.
arXiv Detail & Related papers (2023-01-27T15:55:01Z) - Evaluating Probabilistic Classifiers: The Triptych [62.997667081978825]
We propose and study a triptych of diagnostic graphics that focus on distinct and complementary aspects of forecast performance.
The reliability diagram addresses calibration, the receiver operating characteristic (ROC) curve diagnoses discrimination ability, and the Murphy diagram visualizes overall predictive performance and value.
arXiv Detail & Related papers (2023-01-25T19:35:23Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Training conformal predictors [0.0]
Efficiency criteria for conformal prediction, such as observed fuzziness, are commonly used to evaluate the performance of given conformal predictors.
Here, we investigate whether it is possible to exploit such criteria to learn classifiers.
arXiv Detail & Related papers (2020-05-14T14:47:30Z)
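The "Calibration of Ordinal Regression Networks" entry above mentions soft ordinal encoding; for reference, the sketch below shows one common, generic form of it (soft targets that decay with ordinal distance to the true grade). This is an assumed illustration, not the cited paper's exact encoding, and the scale parameter is an arbitrary choice.

import numpy as np

def soft_ordinal_targets(true_class, num_classes, scale=2.0):
    # One common form of soft ordinal encoding: replace the one-hot target
    # with a softmax over negative absolute distances to the true class, so
    # the target distribution is unimodal and decays with ordinal distance.
    ranks = np.arange(num_classes)
    logits = -scale * np.abs(ranks - true_class)
    weights = np.exp(logits - logits.max())
    return weights / weights.sum()

print(np.round(soft_ordinal_targets(2, 5), 3))  # peaks at grade 2, decays towards grades 0 and 4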
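Likewise, the "On Temperature Scaling and Conformal Prediction" entry above names two standard post-hoc tools; the sketch below shows their textbook forms (a single temperature applied to the logits, and split conformal prediction with the simple 1 - p(true class) score). It is not that paper's specific procedure; the function names and the default alpha are illustrative assumptions.

import numpy as np

def temperature_scale(logits, T):
    # Softmax of logits divided by a scalar temperature T; T > 1 softens
    # over-confident probabilities (T is normally fitted on a validation set).
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    # Split conformal prediction with the score 1 - p(true class): calibrate a
    # quantile threshold on held-out data, then keep every candidate label
    # whose score stays below it, targeting ~(1 - alpha) marginal coverage.
    cal_probs = np.asarray(cal_probs, dtype=float)
    test_probs = np.asarray(test_probs, dtype=float)
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), np.asarray(cal_labels)]
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    return [np.where(1.0 - row <= q)[0] for row in test_probs]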