On Model Evaluation under Non-constant Class Imbalance
- URL: http://arxiv.org/abs/2001.05571v2
- Date: Wed, 15 Apr 2020 17:58:21 GMT
- Title: On Model Evaluation under Non-constant Class Imbalance
- Authors: Jan Brabec, Tomáš Komárek, Vojtěch Franc, Lukáš Machlica
- Abstract summary: Many real-world classification problems are significantly class-imbalanced to the detriment of the class of interest.
The usual assumption is that the test dataset imbalance equals the real-world imbalance.
We introduce methods focusing on evaluation under non-constant class imbalance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world classification problems are significantly class-imbalanced to
the detriment of the class of interest. The standard set of proper evaluation
metrics is well known, but the usual assumption is that the test-dataset
imbalance equals the real-world imbalance. In practice, this assumption is
often broken for various reasons. The reported results are then often too
optimistic and may lead to wrong conclusions about the industrial impact and
suitability of the proposed techniques. We introduce methods for evaluation
under non-constant class imbalance. We show that not only the absolute values
of commonly used metrics but even the ranking of classifiers under a given
evaluation metric is affected by a change in the imbalance rate. Finally, we
demonstrate that subsampling the test dataset to match the class imbalance
observed in the wild is unnecessary and can even introduce significant errors
into the estimate of a classifier's performance.
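The ranking effect described in the abstract is easy to reproduce from first principles. Below is a minimal sketch (not the authors' code) that recomputes precision and F1 for a classifier with fixed TPR/FPR at an arbitrary positive-class prior via Bayes' rule; the classifiers A and B and their rates are hypothetical.

```python
def precision_at(tpr: float, fpr: float, pi: float) -> float:
    # Bayes' rule: P(y=1 | y_hat=1) for a classifier with fixed TPR/FPR,
    # evaluated against a positive-class prior pi. No subsampling needed.
    return tpr * pi / (tpr * pi + fpr * (1.0 - pi))

def f1_at(tpr: float, fpr: float, pi: float) -> float:
    # F1 combines the prior-dependent precision with the prior-free recall (TPR).
    p = precision_at(tpr, fpr, pi)
    return 2.0 * p * tpr / (p + tpr)

# Hypothetical classifiers: A has the higher TPR, B the far lower FPR.
A = dict(tpr=0.95, fpr=0.05)
B = dict(tpr=0.50, fpr=0.001)

for pi in (0.5, 0.01, 0.001):
    print(f"pi={pi:<5} F1(A)={f1_at(pi=pi, **A):.3f}  F1(B)={f1_at(pi=pi, **B):.3f}")
# A wins at pi=0.5, but B wins at pi<=0.01: the ranking flips with the imbalance rate.
```

Evaluating at one imbalance rate while deploying at another is exactly the failure mode the paper warns about; recomputing the metrics as above sidesteps it without touching the test set.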
Related papers
- $F_\beta$-plot -- a visual tool for evaluating imbalanced data classifiers [0.0]
The paper proposes a simple approach to analyzing the popular parametric metric $F_\beta$.
For a given pool of analyzed classifiers, it makes it possible to indicate when a given model should be preferred depending on user requirements, as the sketch below illustrates.
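For context, $F_\beta$ is the standard parametric F-score. A minimal sketch (using a hypothetical pool of classifiers, not the paper's tool) of how the preferred model changes with $\beta$:

```python
import numpy as np

def f_beta(precision: float, recall: float, beta: float) -> float:
    # Standard parametric F-score: beta > 1 favors recall, beta < 1 favors precision.
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Hypothetical pool of classifiers given as (precision, recall) pairs.
pool = {"clf_1": (0.90, 0.60), "clf_2": (0.70, 0.80)}

for beta in np.linspace(0.25, 2.0, 8):
    best = max(pool, key=lambda name: f_beta(*pool[name], beta))
    print(f"beta={beta:.2f}: preferred model = {best}")
# clf_1 is preferred at small beta (precision-oriented users),
# clf_2 at large beta (recall-oriented users).
```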
arXiv Detail & Related papers (2024-04-11T18:07:57Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
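As a minimal illustration of the quantization step mentioned above (the target range and bin layout are hypothetical, not the paper's setup):

```python
import numpy as np

# Hypothetical example: quantize a continuous target (e.g. age in [0, 100])
# into 10 equal-width classes, the usual first step when a regression task
# is reformulated as classification.
targets = np.array([3.2, 41.7, 18.5, 88.0, 65.3])
bin_edges = np.linspace(0.0, 100.0, 11)
class_labels = np.digitize(targets, bin_edges) - 1
print(class_labels)  # [0 4 1 8 6]
```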
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distributions.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
- Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization [80.22736530704191]
We question the widely adopted assumption in prior work that the context bias can be directly annotated or estimated from biased class prediction.
In contrast, we point out the ever-overlooked other side of the above principle: context is also invariant to class.
We implement this idea by minimizing the contrastive loss of intra-class sample similarity while assuring this similarity to be invariant across all classes.
arXiv Detail & Related papers (2022-08-06T08:09:54Z)
- Learning to Adapt Classifier for Imbalanced Semi-supervised Learning [38.434729550279116]
Pseudo-labeling has proven to be a promising semi-supervised learning (SSL) paradigm.
Existing pseudo-labeling methods commonly assume that the class distributions of training data are balanced.
In this work, we investigate pseudo-labeling under imbalanced semi-supervised setups.
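For reference, a minimal confidence-thresholded pseudo-labeling sketch (the generic baseline, not this paper's adaptive method):

```python
import numpy as np

def pseudo_label(probs: np.ndarray, threshold: float = 0.95):
    """Select unlabeled samples whose max softmax probability exceeds the
    threshold and assign their predicted class as a hard label. Under class
    imbalance the selected labels tend to skew toward the majority classes,
    which is the failure mode studied here."""
    confidence = probs.max(axis=1)
    keep = np.where(confidence >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)

probs = np.array([[0.98, 0.02], [0.60, 0.40], [0.05, 0.95]])
idx, labels = pseudo_label(probs)
print(idx, labels)  # [0 2] [0 1]
```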
arXiv Detail & Related papers (2022-07-28T02:15:47Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and WebVision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
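A minimal sketch of the general nearest-prototype idea (the paper's exact procedure may differ):

```python
import numpy as np

def prototype_predict(queries: np.ndarray,
                      train_emb: np.ndarray,
                      train_labels: np.ndarray) -> np.ndarray:
    # Each class is represented by the mean embedding of its samples;
    # a query gets the label of the nearest prototype. The per-class mean
    # does not grow with class size, which keeps predictions comparable
    # even when the training set is class-imbalanced.
    classes = np.unique(train_labels)
    prototypes = np.stack([train_emb[train_labels == c].mean(axis=0)
                           for c in classes])
    dists = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]

# Usage with random embeddings (hypothetical shapes): 100 train points, 3 classes.
emb, labels = np.random.randn(100, 16), np.random.randint(0, 3, 100)
print(prototype_predict(np.random.randn(5, 16), emb, labels))
```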
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Statistical Theory for Imbalanced Binary Classification [8.93993657323783]
We show that optimal classification performance depends on certain properties of class imbalance that have not previously been formalized.
Specifically, we propose a novel sub-type of class imbalance, which we call Uniform Class Imbalance.
These results provide some of the first meaningful finite-sample statistical theory for imbalanced binary classification.
arXiv Detail & Related papers (2021-07-05T03:55:43Z)
- A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification [11.125446871030734]
Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes.
We propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances.
arXiv Detail & Related papers (2020-10-12T19:47:09Z)
- Classification Performance Metric for Imbalance Data Based on Recall and Selectivity Normalized in Class Labels [0.0]
We introduce a new performance measure based on the harmonic mean of Recall and Selectivity normalized in class labels.
This paper shows that the proposed performance measure has the right properties for the imbalanced dataset.
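The underlying quantity is straightforward. A sketch of the harmonic mean of recall and selectivity (the paper's additional normalization in the class labels is not reproduced here):

```python
def recall_selectivity_hmean(tp: int, fn: int, tn: int, fp: int) -> float:
    recall = tp / (tp + fn)        # true-positive rate on the positive class
    selectivity = tn / (tn + fp)   # true-negative rate on the negative class
    # The harmonic mean punishes a low score on either class, unlike accuracy,
    # which the majority class can dominate on imbalanced data.
    return 2.0 * recall * selectivity / (recall + selectivity)

print(recall_selectivity_hmean(tp=40, fn=10, tn=900, fp=50))  # ~0.868
```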
arXiv Detail & Related papers (2020-06-23T20:38:48Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them struggle to generalize under a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.