Related papers: Intrinsic Dimensionality as a Model-Free Measure of Class Imbalance

URL: http://arxiv.org/abs/2511.10475v1
Date: Fri, 14 Nov 2025 01:53:19 GMT
Title: Intrinsic Dimensionality as a Model-Free Measure of Class Imbalance
Authors: Çağrı Eser, Zeynep Sonat Baltacı, Emre Akbaş, Sinan Kalkan
Abstract summary: Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance.
Score: 8.819673391477036
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Imbalance in classification tasks is commonly quantified by the cardinalities of examples across classes. This, however, disregards the presence of redundant examples and inherent differences in the learning difficulties of classes. Alternatively, one can use complex measures such as training loss and uncertainty, which, however, depend on training a machine learning model. Our paper proposes using data Intrinsic Dimensionality (ID) as an easy-to-compute, model-free measure of imbalance that can be seamlessly incorporated into various imbalance mitigation methods. Our results across five different datasets with a diverse range of imbalance ratios show that ID consistently outperforms cardinality-based re-weighting and re-sampling techniques used in the literature. Moreover, we show that combining ID with cardinality can further improve performance. Code: https://github.com/cagries/IDIM.
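To make the idea in the abstract concrete, the following is a minimal sketch of how a per-class ID estimate could be turned into class weights for re-weighting. It is not the authors' implementation (see the linked repository for that). It assumes the Levina-Bickel maximum-likelihood ID estimator with k nearest neighbours, and it assumes weights simply proportional to each class's estimated ID; the function names mle_intrinsic_dim and id_class_weights and the default k=10 are hypothetical choices made for illustration.

```python
import numpy as np


def mle_intrinsic_dim(X, k=10):
    """Levina-Bickel MLE estimate of intrinsic dimensionality.

    Assumption: this estimator stands in for whichever ID estimator
    the paper actually uses. Returns the mean of per-point estimates.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n <= k:
        raise ValueError("need more than k points to use k neighbours")

    # Dense pairwise Euclidean distances (fine for small illustrative data).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude each point from its own neighbours

    # Distances to the k nearest neighbours, sorted ascending per row.
    knn = np.sort(dists, axis=1)[:, :k]
    knn = np.maximum(knn, 1e-12)  # guard against log(0) for duplicate points

    # Per-point estimate: inverse of the mean log-ratio T_k / T_j, j = 1..k-1.
    log_ratio = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 1) / np.sum(log_ratio, axis=1)
    return float(np.mean(per_point))


def id_class_weights(X, y, k=10):
    """Hypothetical weighting rule: weight each class in proportion to its
    estimated ID, normalised so that the weights average to 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    ids = np.array([mle_intrinsic_dim(X[y == c], k=k) for c in classes])
    weights = ids / ids.mean()
    return dict(zip(classes.tolist(), weights.tolist()))
```

Under this sketch, two classes with the same number of examples can still receive different weights: a class whose points lie near a low-dimensional structure gets a smaller weight than a class that fills more dimensions. The returned dictionary could be passed to any loss or sampler that accepts per-class weights.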

Related papers

A replica analysis of under-bagging [3.1274367448459253]
Under-bagging (UB) is a popular ensemble learning method for training classifiers on imbalanced data. Using bagging to reduce the increased variance caused by the reduction in sample size due to under-sampling is a natural approach. It has recently been pointed out that in generalized linear models, naive bagging, which does not consider the class imbalance structure, and ridge regularization can produce the same results.
arXiv Detail & Related papers (2024-04-15T13:31:31Z)
Class Uncertainty: A Measure to Mitigate Class Imbalance [0.0]
We show that considering solely the cardinality of classes does not cover all issues causing class imbalance. We propose "Class Uncertainty" as the average predictive uncertainty of the training examples. We also curate SVCI-20 as a novel dataset in which the classes have an equal number of training examples but differ in terms of their hardness.
arXiv Detail & Related papers (2023-11-23T16:36:03Z)
Instance-specific and Model-adaptive Supervision for Semi-supervised Semantic Segmentation [49.82432158155329]
We propose an instance-specific and model-adaptive supervision for semi-supervised semantic segmentation, named iMAS. iMAS learns from unlabeled instances progressively by weighing their corresponding consistency losses based on the evaluated hardness.
arXiv Detail & Related papers (2022-11-21T10:37:28Z)
Class-Imbalanced Complementary-Label Learning via Weighted Loss [8.934943507699131]
Complementary-label learning (CLL) is widely used in weakly supervised classification. It faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. We propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification.
arXiv Detail & Related papers (2022-09-28T16:02:42Z)
Meta-Causal Feature Learning for Out-of-Distribution Generalization [71.38239243414091]
This paper presents a balanced meta-causal learner (BMCL), which includes a balanced task generation module (BTG) and a meta-causal feature learning module (MCFL). BMCL effectively identifies the class-invariant visual regions for classification and may serve as a general framework to improve the performance of the state-of-the-art methods.
arXiv Detail & Related papers (2022-08-22T09:07:02Z)
Constructing Balance from Imbalance for Long-tailed Image Recognition [50.6210415377178]
The imbalance between majority (head) classes and minority (tail) classes severely skews data-driven deep neural networks. Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, and model design. We propose a concise paradigm by progressively adjusting label space and dividing the head classes and tail classes. Our proposed model also provides a feature evaluation method and paves the way for long-tailed feature learning.
arXiv Detail & Related papers (2022-08-04T10:22:24Z)
CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance. Sample re-weighting methods are popularly used to alleviate this data bias issue. We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
Intra-Class Uncertainty Loss Function for Classification [6.523198497365588]
Intra-class uncertainty/variability is not considered, especially for datasets containing unbalanced classes. In our framework, the features extracted by deep networks of each class are characterized by an independent Gaussian distribution. The proposed approach shows improved classification performance through learning a better class representation.
arXiv Detail & Related papers (2021-04-12T09:02:41Z)
Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first asymptotically precise analysis of linear multiclass classification. Our analysis reveals that the classification accuracy is highly distribution-dependent. The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
Classification Performance Metric for Imbalance Data Based on Recall and Selectivity Normalized in Class Labels [0.0]
We introduce a new performance measure based on the harmonic mean of Recall and Selectivity normalized in class labels. This paper shows that the proposed performance measure has the right properties for imbalanced datasets.
arXiv Detail & Related papers (2020-06-23T20:38:48Z)
Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers. Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.