Classification Performance Metric for Imbalance Data Based on Recall and
Selectivity Normalized in Class Labels
- URL: http://arxiv.org/abs/2006.13319v1
- Date: Tue, 23 Jun 2020 20:38:48 GMT
- Title: Classification Performance Metric for Imbalance Data Based on Recall and
Selectivity Normalized in Class Labels
- Authors: Robert Burduk
- Abstract summary: We introduce a new performance measure based on the harmonic mean of Recall and Selectivity normalized in class labels.
This paper shows that the proposed performance measure has the right properties for imbalanced datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the classification of class-imbalanced datasets, the performance
measure used for model selection and for comparison with competing methods is a
major issue. To address this problem, several performance measures have been
defined and analyzed from various perspectives, in particular with respect to
the imbalance ratio. There is still no clear indication of which metric is
universal and can be used for any skewed data problem. In this paper we
introduce a new performance measure based on the harmonic mean of Recall and
Selectivity normalized in class labels. We show that the proposed performance
measure has the right properties for imbalanced datasets. In particular, in the
space defined by the majority class examples and the imbalance ratio, it is
less sensitive to changes in the majority class and more sensitive to changes
in the minority class than other existing single-value performance measures.
Additionally, the identity of the other performance measures is proven
analytically.
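As a rough illustration of the underlying quantity, the sketch below computes the plain harmonic mean of recall (true positive rate) and selectivity (true negative rate) from a binary confusion matrix. This is a minimal sketch only: the abstract does not spell out the normalization in class labels, and the function name and interface are hypothetical rather than the authors' implementation.

```python
import numpy as np

def harmonic_recall_selectivity(y_true, y_pred, positive=1):
    """Harmonic mean of recall (TPR) and selectivity (TNR) for binary labels.

    Illustrative only: the paper's measure additionally normalizes these
    quantities in the class labels, which is not reproduced here.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)

    tp = np.sum((y_true == positive) & (y_pred == positive))
    fn = np.sum((y_true == positive) & (y_pred != positive))
    tn = np.sum((y_true != positive) & (y_pred != positive))
    fp = np.sum((y_true != positive) & (y_pred == positive))

    recall = tp / (tp + fn) if (tp + fn) else 0.0       # sensitivity, TPR
    selectivity = tn / (tn + fp) if (tn + fp) else 0.0  # specificity, TNR

    denom = recall + selectivity
    return 2.0 * recall * selectivity / denom if denom else 0.0
```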
Related papers
- Improving the classification of extreme classes by means of loss regularisation and generalised beta distributions [8.640930010669042]
We propose a unimodal regularisation approach to improve the classification performance of the first and last classes.
Performance in the extreme classes is compared using a new metric that takes into account their sensitivities.
The results for the proposed metric show that the generalised beta distribution generally improves classification performance in the extreme classes.
arXiv Detail & Related papers (2024-07-17T08:57:42Z)
- $F_\beta$-plot -- a visual tool for evaluating imbalanced data classifiers [0.0]
The paper proposes a simple approach to analyzing the popular parametric metric $F_\beta$.
For a given pool of analyzed classifiers, it makes it possible to indicate which model should be preferred depending on user requirements.
arXiv Detail & Related papers (2024-04-11T18:07:57Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification [64.39761523935613]
We propose a percentile-based threshold adjusting scheme to dynamically alter the score thresholds of positive and negative pseudo-labels for each class during training.
We achieve strong performance on Pascal VOC2007 and MS-COCO datasets when compared to recent SSL methods.
arXiv Detail & Related papers (2022-08-30T01:27:48Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Statistical Theory for Imbalanced Binary Classification [8.93993657323783]
We show that optimal classification performance depends on certain properties of class imbalance that have not previously been formalized.
Specifically, we propose a novel sub-type of class imbalance, which we call Uniform Class Imbalance.
These results provide some of the first meaningful finite-sample statistical theory for imbalanced binary classification.
arXiv Detail & Related papers (2021-07-05T03:55:43Z)
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes the ratio between positive and negative labels for each class during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
- A Skew-Sensitive Evaluation Framework for Imbalanced Data Classification [11.125446871030734]
Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes.
We propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances.
arXiv Detail & Related papers (2020-10-12T19:47:09Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks struggle to generalize to a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
- On Model Evaluation under Non-constant Class Imbalance [0.0]
Many real-world classification problems are significantly class-imbalanced, to the detriment of the class of interest.
The usual assumption is that the test dataset imbalance equals the real-world imbalance.
We introduce methods focusing on evaluation under non-constant class imbalance.
arXiv Detail & Related papers (2020-01-15T21:52:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.