iCost: A Novel Instance Complexity Based Cost-Sensitive Learning Framework
- URL: http://arxiv.org/abs/2409.13007v2
- Date: Fri, 25 Oct 2024 11:29:44 GMT
- Title: iCost: A Novel Instance Complexity Based Cost-Sensitive Learning Framework
- Authors: Asif Newaz, Asif Ur Rahman Adib, Taskeed Jabid
- Abstract summary: Class imbalance in data presents significant challenges for classification tasks.
Traditional classification algorithms become biased toward the majority class.
We propose a novel instance complexity-based cost-sensitive approach (termed 'iCost') in this study.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Class imbalance in data presents significant challenges for classification tasks. It is fairly common and requires careful handling to obtain desirable performance. Traditional classification algorithms become biased toward the majority class. One way to alleviate this is to make the classifiers cost-sensitive, by assigning a higher misclassification cost to minority-class instances. One issue with this implementation is that all the minority-class instances are treated equally and assigned the same penalty value. However, the learning difficulty is not the same for all instances. Instances located in the overlapping region or near the decision boundary are harder to classify, whereas those further away are easier. Ignoring instance complexity and naively weighting all the minority-class samples uniformly results in an unwarranted bias and, consequently, a higher number of misclassifications of majority-class instances. This is undesirable, and to overcome it, we propose a novel instance complexity-based cost-sensitive approach (termed 'iCost') in this study. We first categorize all the minority-class instances based on their difficulty level and then penalize them accordingly. This ensures a more equitable instance weighting and prevents excessive penalization. The performance of the proposed approach is tested on 65 binary and 10 multiclass imbalanced datasets against the traditional cost-sensitive learning frameworks. A significant improvement in performance has been observed, demonstrating the effectiveness of the proposed strategy.
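As a rough, non-authoritative illustration of the idea described in the abstract (not the authors' exact procedure), the sketch below assigns per-instance misclassification costs to minority-class samples according to how many of their k nearest neighbours belong to the majority class. The neighbourhood-based difficulty measure, the category thresholds, and the cost values are assumptions made for this example; the paper's own categorization rule and penalties may differ.

```python
# Minimal sketch of instance complexity-based cost-sensitive weighting.
# Assumptions (not taken from the paper): difficulty is measured by the share
# of majority-class samples among the k nearest neighbours, and the three
# cost values below are purely illustrative.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

def instance_complexity_weights(X, y, minority_label, k=5,
                                safe_cost=1.5, borderline_cost=3.0, hard_cost=5.0):
    """Per-instance costs: majority-class instances keep a cost of 1,
    minority-class instances get a cost based on their local difficulty."""
    y = np.asarray(y)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                 # idx[:, 0] is the point itself
    neighbour_labels = y[idx[:, 1:]]          # labels of the k neighbours
    maj_fraction = (neighbour_labels != minority_label).mean(axis=1)

    weights = np.ones(len(y))
    minority = (y == minority_label)
    # Safe, borderline, and overlapping/hard minority instances get distinct costs.
    weights[minority & (maj_fraction < 0.3)] = safe_cost
    weights[minority & (maj_fraction >= 0.3) & (maj_fraction < 0.7)] = borderline_cost
    weights[minority & (maj_fraction >= 0.7)] = hard_cost
    return weights

# Usage: any estimator that accepts sample_weight can consume these costs.
# X, y = ...  # imbalanced training data (features, binary labels)
# w = instance_complexity_weights(X, y, minority_label=1)
# clf = DecisionTreeClassifier().fit(X, y, sample_weight=w)
```

The direction of the adjustment (harder instances receiving a larger cost) is itself an assumption for this example; the essential point of the approach is simply that minority-class instances are not all weighted uniformly.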
Related papers
- Confronting Discrimination in Classification: Smote Based on Marginalized Minorities in the Kernel Space for Imbalanced Data [0.0]
We propose a novel classification oversampling approach based on the decision boundary and sample proximity relationships.
We test the proposed method on a classic financial fraud dataset.
arXiv Detail & Related papers (2024-02-13T04:03:09Z)
- Class Uncertainty: A Measure to Mitigate Class Imbalance [0.0]
We show that considering solely the cardinality of classes does not cover all issues causing class imbalance.
We propose "Class Uncertainty" as the average predictive uncertainty of the training examples.
We also curate SVCI-20 as a novel dataset in which the classes have an equal number of training examples but differ in terms of their hardness.
arXiv Detail & Related papers (2023-11-23T16:36:03Z)
- Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection [64.4350027428928]
We propose a novel uncertainty-guided class imbalance learning framework for imbalanced social event detection tasks.
Our model significantly improves social event representation and classification tasks in almost all classes, especially those uncertain ones.
arXiv Detail & Related papers (2023-10-30T03:32:04Z)
- A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
- A Universal Unbiased Method for Classification from Aggregate Observations [115.20235020903992]
This paper presents a novel universal method of CFAO, which holds an unbiased estimator of the classification risk for arbitrary losses.
Our proposed method not only guarantees the risk consistency due to the unbiased risk estimator but also can be compatible with arbitrary losses.
arXiv Detail & Related papers (2023-06-20T07:22:01Z)
- Review of Methods for Handling Class-Imbalanced in Classification Problems [0.0]
In some cases, one class contains the majority of examples while the other, which is frequently the more important class, is nevertheless represented by a smaller proportion of examples.
The article examines the most widely used methods for addressing the problem of learning with a class imbalance, including data-level, algorithm-level, hybrid, cost-sensitive learning, and deep learning.
arXiv Detail & Related papers (2022-11-10T10:07:10Z)
- Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels [87.48541631675889]
We propose a two-stage clean samples identification method.
First, we employ a class-level feature clustering procedure for the early identification of clean samples.
Second, for the remaining clean samples that are close to the ground truth class boundary, we propose a novel consistency-based classification method.
arXiv Detail & Related papers (2022-07-29T04:54:57Z)
- Mining Minority-class Examples With Uncertainty Estimates [102.814407678425]
In the real world, the frequency of occurrence of objects is naturally skewed forming long-tail class distributions.
We propose an effective, yet simple, approach to overcome these challenges.
Our framework enhances the subdued tail-class activations and, thereafter, uses a one-class data-centric approach to effectively identify tail-class examples.
arXiv Detail & Related papers (2021-12-15T02:05:02Z)
- Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning [45.510042484456854]
This paper presents a simple unsupervised visual representation learning method with a pretext task of discriminating all images in a dataset using a parametric, instance-level computation.
The overall framework is a replica of a supervised classification model, where semantic classes (e.g., dog, bird, and ship) are replaced by instance IDs.
Scaling up the classification task from thousands of semantic labels to millions of instance labels brings specific challenges, including 1) the large-scale softmax classifier; 2) the slow convergence due to the infrequent visiting of instance samples; and 3) the massive number of negative classes that can be noisy.
arXiv Detail & Related papers (2021-02-09T14:44:18Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, where deep neural networks suffer from generalizing to a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.