A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification
- URL: http://arxiv.org/abs/2010.05995v2
- Date: Thu, 16 Nov 2023 21:08:05 GMT
- Title: A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification
- Authors: Min Du, Nesime Tatbul, Brian Rivers, Akhilesh Kumar Gupta, Lucas Hu,
Wei Wang, Ryan Marcus, Shengtian Zhou, Insup Lee, Justin Gottschlich
- Abstract summary: Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes.
We propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances.
- Score: 11.125446871030734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Class distribution skews in imbalanced datasets may lead to models with
prediction bias towards majority classes, making fair assessment of classifiers
a challenging task. Metrics such as Balanced Accuracy are commonly used to
evaluate a classifier's prediction performance under such scenarios. However,
these metrics fall short when classes vary in importance. In this paper, we
propose a simple and general-purpose evaluation framework for imbalanced data
classification that is sensitive to arbitrary skews in class cardinalities and
importances. Experiments with several state-of-the-art classifiers tested on
real-world datasets from three different domains show the effectiveness of our
framework - not only in evaluating and ranking classifiers, but also in
training them.
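The paper's own metric definitions are not reproduced in this summary. As a minimal sketch of the core idea, assuming user-supplied per-class importance weights (the function and names below are illustrative, not the paper's formulation), per-class recall can be averaged under those weights; with uniform weights this reduces to Balanced Accuracy:

```python
import numpy as np

def importance_weighted_recall(y_true, y_pred, importance):
    """Average per-class recall, weighted by user-supplied class importances.

    With uniform importances this reduces to Balanced Accuracy; skewed
    importances let a rare but critical class dominate the score.
    """
    classes = np.unique(y_true)
    weights = np.array([importance[c] for c in classes], dtype=float)
    weights /= weights.sum()                      # normalize importances
    recalls = np.array([
        np.mean(y_pred[y_true == c] == c)         # per-class recall
        for c in classes
    ])
    return float(np.sum(weights * recalls))

# Example: class 1 (rare but critical) counts 4x as much as class 0.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 0])
print(importance_weighted_recall(y_true, y_pred, {0: 1.0, 1: 4.0}))
```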
Related papers
- Improving the classification of extreme classes by means of loss regularisation and generalised beta distributions [8.640930010669042]
We propose a unimodal regularisation approach to improve the classification performance of the first and last classes.
Performance in the extreme classes is compared using a new metric that takes into account their sensitivities.
The results for the proposed metric show that the generalised beta distribution generally improves classification performance in the extreme classes.
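The abstract does not spell out how the soft targets are built. One common construction consistent with the description, sketched here under assumed shape parameters (`a_base` is a made-up knob), discretizes a beta density over the ordinal scale so its mode sits on the true class:

```python
import numpy as np
from scipy.stats import beta

def unimodal_soft_labels(num_classes, true_class, a_base=2.0):
    """Unimodal soft targets from a beta pdf over the ordinal scale.

    The shape parameters are chosen so the mode of Beta(a, b) falls at the
    centre of the true class's bin; mass decays smoothly toward neighbouring
    classes, regularising the loss at the extreme (first/last) classes.
    """
    mode = (true_class + 0.5) / num_classes
    a = 1.0 + a_base * mode
    b = 1.0 + a_base * (1.0 - mode)
    edges = np.linspace(0.0, 1.0, num_classes + 1)
    mass = beta.cdf(edges[1:], a, b) - beta.cdf(edges[:-1], a, b)
    return mass / mass.sum()

print(unimodal_soft_labels(5, 0))   # mass concentrated on the first class
print(unimodal_soft_labels(5, 4))   # mass concentrated on the last class
```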
arXiv Detail & Related papers (2024-07-17T08:57:42Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
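The paper's HCA procedure is not detailed in the abstract; the toy sketch below only illustrates the two ingredients it names, quantizing a continuous target into classes and letting a coarse head rescale a fine head's probabilities (the blocking scheme is an assumption):

```python
import numpy as np

def quantize(y, num_bins, lo=0.0, hi=1.0):
    """Map continuous targets in [lo, hi] to class indices 0..num_bins-1."""
    edges = np.linspace(lo, hi, num_bins + 1)[1:-1]
    return np.digitize(y, edges)

def hierarchical_adjust(fine_probs, coarse_probs, fine_per_coarse):
    """Rescale fine-class probabilities by their coarse parent's mass.

    Each block of `fine_per_coarse` fine classes shares one coarse parent;
    the coarse head, trained on a less imbalanced problem, corrects the
    fine head's bias toward head-range classes.
    """
    fine = fine_probs.reshape(-1, fine_per_coarse)
    fine = fine / fine.sum(axis=1, keepdims=True)   # within-parent distribution
    return (fine * coarse_probs[:, None]).reshape(-1)

y = np.array([0.05, 0.40, 0.90])
print(quantize(y, num_bins=4))          # fine classes:   [0, 1, 3]
print(quantize(y, num_bins=2))          # coarse parents: [0, 0, 1]

fine_probs = np.array([0.50, 0.30, 0.10, 0.10])   # biased toward head bins
coarse_probs = np.array([0.40, 0.60])             # coarse head disagrees
print(hierarchical_adjust(fine_probs, coarse_probs, fine_per_coarse=2))
```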
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distributions.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over state-of-the-art methods.
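As a rough sketch of the nearest-prototype idea, assuming a frozen embedding network and omitting whatever calibration the paper adds, each class is summarized by its mean embedding and samples are assigned to the closest prototype:

```python
import numpy as np

def class_prototypes(embeddings, labels, num_classes):
    """Mean embedding per class; no extra learned parameters."""
    return np.stack([embeddings[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def predict_by_prototype(embeddings, prototypes):
    """Assign each sample to its nearest class prototype (Euclidean).

    Because every class contributes exactly one prototype, the decision
    rule is not swamped by majority-class sample counts.
    """
    dists = np.linalg.norm(embeddings[:, None, :] - prototypes[None, :, :],
                           axis=-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
labels = rng.integers(0, 3, size=100)
protos = class_prototypes(emb, labels, num_classes=3)
print(predict_by_prototype(emb[:5], protos))
```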
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
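The full CAN procedure (which uses anchor examples and an exponent hyperparameter) is not reproduced here; a simplified Sinkhorn-style sketch of the alternating step, assuming a known target class prior, looks like this:

```python
import numpy as np

def alternating_normalization(probs, class_prior, num_iters=3):
    """Sinkhorn-style post-processing of predicted class distributions.

    Alternately (1) rescale columns so the average class mass matches the
    prior, and (2) renormalize rows to be valid distributions. Rows of
    `probs` are per-example predicted distributions.
    """
    p = probs.copy()
    for _ in range(num_iters):
        col_mass = p.mean(axis=0)                  # current average class mass
        p = p * (class_prior / col_mass)[None, :]  # match the target prior
        p = p / p.sum(axis=1, keepdims=True)       # rows back to the simplex
    return p

probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.4, 0.4, 0.2]])
prior = np.array([1/3, 1/3, 1/3])                  # assume balanced classes
print(alternating_normalization(probs, prior))
```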
arXiv Detail & Related papers (2021-09-28T02:55:42Z)
- Statistical Theory for Imbalanced Binary Classification [8.93993657323783]
We show that optimal classification performance depends on certain properties of class imbalance that have not previously been formalized.
Specifically, we propose a novel sub-type of class imbalance, which we call Uniform Class Imbalance.
These results provide some of the first meaningful finite-sample statistical theory for imbalanced binary classification.
arXiv Detail & Related papers (2021-07-05T03:55:43Z)
- Predicting Classification Accuracy When Adding New Unobserved Classes [8.325327265120283]
We study how a classifier's performance can be used to extrapolate its expected accuracy on a larger, unobserved set of classes.
We formulate a robust neural-network-based algorithm, "CleaneX", which learns to estimate the accuracy of such classifiers on arbitrarily large sets of classes.
arXiv Detail & Related papers (2020-10-28T14:37:25Z)
- Classification Performance Metric for Imbalance Data Based on Recall and Selectivity Normalized in Class Labels [0.0]
We introduce a new performance measure based on the harmonic mean of Recall and Selectivity normalized in class labels.
This paper shows that the proposed performance measure has the right properties for the imbalanced dataset.
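The paper's label normalization is not given in the abstract, but the underlying quantity is well defined for binary labels; a minimal sketch (Selectivity here is the true-negative rate, i.e. specificity):

```python
import numpy as np

def recall_selectivity_harmonic(y_true, y_pred):
    """Harmonic mean of Recall (TPR) and Selectivity (TNR) for binary labels.

    Unlike plain accuracy, both terms are computed within their own class,
    so the score cannot be inflated by flooding the majority class.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recall = np.mean(y_pred[y_true == 1] == 1)       # TP / (TP + FN)
    selectivity = np.mean(y_pred[y_true == 0] == 0)  # TN / (TN + FP)
    if recall + selectivity == 0:
        return 0.0
    return 2 * recall * selectivity / (recall + selectivity)

# 90:10 imbalance; predicting all-majority scores 0 rather than 0.9.
y_true = [0] * 9 + [1]
print(recall_selectivity_harmonic(y_true, [0] * 10))
```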
arXiv Detail & Related papers (2020-06-23T20:38:48Z)
- Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective [98.70226503904402]
Object frequency in the real world often follows a power law, leading to a mismatch between datasets with long-tailed class distributions and the expectation that models perform well on all classes.
We propose to augment the classic class-balanced learning by explicitly estimating the differences between the class-conditioned distributions with a meta-learning approach.
arXiv Detail & Related papers (2020-03-24T11:28:42Z)
- On Model Evaluation under Non-constant Class Imbalance [0.0]
Many real-world classification problems are significantly class-imbalanced to the detriment of the class of interest.
The usual assumption is that the test dataset imbalance equals the real-world imbalance.
We introduce methods focusing on evaluation under non-constant class imbalance.
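One standard adjustment in this setting, not necessarily the paper's method, recomputes precision from the ratio-invariant rates TPR and FPR at a new deployment-time prevalence:

```python
def precision_at_prevalence(tpr, fpr, prevalence):
    """Re-derive precision for a deployment-time positive-class prevalence.

    TPR and FPR are per-class rates, so they are invariant to the class
    ratio; precision is not, and must be recomputed when the real-world
    imbalance differs from the test set's.
    """
    tp = tpr * prevalence
    fp = fpr * (1.0 - prevalence)
    return tp / (tp + fp)

# Measured on a balanced test set, then projected to a 1% prevalence.
print(precision_at_prevalence(tpr=0.90, fpr=0.05, prevalence=0.5))   # ~0.947
print(precision_at_prevalence(tpr=0.90, fpr=0.05, prevalence=0.01))  # ~0.154
```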
arXiv Detail & Related papers (2020-01-15T21:52:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.