The SAMME.C2 algorithm for severely imbalanced multi-class
classification
- URL: http://arxiv.org/abs/2112.14868v1
- Date: Thu, 30 Dec 2021 00:20:01 GMT
- Title: The SAMME.C2 algorithm for severely imbalanced multi-class
classification
- Authors: Banghee So and Emiliano A. Valdez
- Abstract summary: A minority class is often considered the more interesting class, yet developing a scientific learning algorithm suitable for its scarce observations presents countless challenges.
We propose a novel multi-class classification algorithm, which we refer to as SAMME.C2, specialized to handle severely imbalanced classes.
Not only do we provide the resulting algorithm, but we also establish the scientific and statistical formulation of our proposed SAMME.C2 algorithm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classification predictive modeling involves the accurate assignment of
observations in a dataset to target classes or categories. Real-world
classification problems with severely imbalanced class distributions are
increasingly common. In these problems, minority classes have far fewer
observations to learn from than majority classes. Despite this sparsity, a
minority class is often considered the more interesting class, yet developing a
scientific learning algorithm suitable for such observations presents countless
challenges. In this article, we propose a novel multi-class classification
algorithm, which we refer to as SAMME.C2, specialized to handle severely
imbalanced classes. It blends the flexible boosting mechanics of the SAMME
algorithm, a multi-class classifier, with those of the Ada.C2 algorithm, a
cost-sensitive binary classifier designed to address high class imbalance. Not
only do we provide the resulting algorithm, but we also establish the
scientific and statistical formulation of our proposed SAMME.C2 algorithm.
Through numerical experiments examining various degrees of classification
difficulty, we demonstrate the consistently superior performance of our
proposed model.
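The abstract describes SAMME.C2 as combining the multi-class boosting mechanics of the SAMME algorithm with the cost-sensitive reweighting of Ada.C2. Below is a minimal sketch of such a cost-sensitive, SAMME-style boosting loop; the per-class cost vector, the decision-stump base learner, and the exact form of the weight update are illustrative assumptions and do not reproduce the authors' formulation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cost_sensitive_samme(X, y, class_costs, n_rounds=50, max_depth=2):
    """Illustrative cost-sensitive SAMME-style boosting loop.

    class_costs maps each class label to a misclassification cost; giving
    minority classes larger costs is an assumption of this sketch, not the
    exact SAMME.C2 construction.
    """
    classes = np.unique(y)
    K, n = len(classes), len(y)
    w = np.full(n, 1.0 / n)                                # observation weights
    c = np.array([class_costs[label] for label in y])      # per-observation costs

    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=max_depth)
        stump.fit(X, y, sample_weight=w)
        miss = (stump.predict(X) != y).astype(float)

        # Cost-weighted error and the SAMME multi-class learner weight,
        # which includes the log(K - 1) correction term.
        err = np.clip(np.sum(w * c * miss) / np.sum(w * c), 1e-10, 1 - 1e-10)
        alpha = np.log((1 - err) / err) + np.log(K - 1)
        if alpha <= 0:
            break

        # Ada.C2-flavoured reweighting: every weight is scaled by its cost and
        # misclassified observations are further boosted by exp(alpha).
        w = w * c * np.exp(alpha * miss)
        w /= w.sum()

        learners.append(stump)
        alphas.append(alpha)

    def predict(X_new):
        votes = np.zeros((len(X_new), K))
        for stump, alpha in zip(learners, alphas):
            pred = stump.predict(X_new)
            for k, label in enumerate(classes):
                votes[:, k] += alpha * (pred == label)
        return classes[np.argmax(votes, axis=1)]

    return predict
```

In practice the per-class costs would typically be set inversely proportional to class frequencies, so that errors on the rarest classes dominate the reweighting.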
Related papers
- S3C: Self-Supervised Stochastic Classifiers for Few-Shot
Class-Incremental Learning [22.243176199188238]
Few-shot class-incremental learning (FSCIL) aims to learn progressively about new classes with very few labeled samples, without forgetting the knowledge of already learnt classes.
FSCIL suffers from two major challenges: (i) over-fitting on the new classes due to the limited amount of data, and (ii) catastrophic forgetting of the old classes because no data from these classes is available in the incremental stages.
arXiv Detail & Related papers (2023-07-05T12:41:46Z) - Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for anomaly detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
Uncertainty is then used to detect anomalies.
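The summary above mentions building an ensemble and using uncertainty to flag anomalies; a common way to realise that idea, shown below as a minimal sketch, is to score inputs by the predictive entropy of an averaged ensemble. The choice of base models and the entropy score are assumptions of this sketch, not the paper's evidence-theoretic procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def ensemble_uncertainty_scores(X_train, y_train, X_test):
    """Score test points by the predictive entropy of an averaged ensemble;
    high entropy (high uncertainty) is treated as anomalous."""
    members = [
        RandomForestClassifier(n_estimators=100, random_state=0),
        LogisticRegression(max_iter=1000),
        SVC(probability=True),
    ]
    probs = []
    for model in members:
        model.fit(X_train, y_train)
        probs.append(model.predict_proba(X_test))
    p = np.mean(probs, axis=0)                      # averaged class probabilities
    return -np.sum(p * np.log(p + 1e-12), axis=1)   # larger = more uncertain
```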
arXiv Detail & Related papers (2022-12-23T00:50:41Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
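The summary attributes much of the gain to entropy regularisation of a parametric classifier. The sketch below shows one standard form of such a regulariser, a mean-prediction entropy term added to the supervised cross-entropy; the weighting and the use of a marginal-entropy penalty are assumptions and may differ from the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def entropy_regularised_loss(logits_labelled, targets, logits_unlabelled, weight=1.0):
    """Supervised cross-entropy plus an entropy regulariser on unlabelled
    predictions (illustrative; not necessarily the paper's exact loss)."""
    ce = F.cross_entropy(logits_labelled, targets)
    p = F.softmax(logits_unlabelled, dim=1)
    mean_p = p.mean(dim=0)                                  # marginal prediction
    # Penalise a low-entropy marginal, which discourages collapsing all
    # unlabelled samples onto a few (typically known) classes.
    neg_entropy = torch.sum(mean_p * torch.log(mean_p + 1e-8))
    return ce + weight * neg_entropy
```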
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Multi-class Classification with Fuzzy-feature Observations: Theory and
Algorithms [36.810603503167755]
We propose a novel framework to address a new, realistic problem called multi-class classification with imprecise observations (MCIMO).
First, we give the theoretical analysis of the MCIMO problem based on fuzzy Rademacher complexity.
Then, two practical algorithms based on support vector machine and neural networks are constructed to solve the proposed new problem.
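The entry describes practical SVM- and neural-network-based algorithms for classification when the features themselves are fuzzy. As a loose illustration only, the sketch below defuzzifies triangular fuzzy features to their centroids and fits an off-the-shelf SVM; the paper's algorithms work with the fuzzy inputs directly rather than through this crisp reduction.

```python
import numpy as np
from sklearn.svm import SVC

def triangular_centroids(fuzzy_X):
    """Reduce triangular fuzzy features (low, mode, high) to their centroids.
    fuzzy_X has shape (n_samples, n_features, 3)."""
    return np.asarray(fuzzy_X).mean(axis=2)

def fit_fuzzy_feature_baseline(fuzzy_X, y):
    """Crude baseline: defuzzify, then fit a standard SVM classifier."""
    return SVC(kernel="rbf").fit(triangular_centroids(fuzzy_X), y)
```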
arXiv Detail & Related papers (2022-06-09T07:14:00Z) - Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z) - Divide-and-Conquer Hard-thresholding Rules in High-dimensional
Imbalanced Classification [1.0312968200748118]
We study the impact of imbalanced class sizes on linear discriminant analysis (LDA) in high dimensions.
We show that due to data scarcity in one class, referred to as the minority class, the LDA ignores the minority class, yielding a maximum misclassification rate.
We propose a new construction of hard-thresholding rules based on a divide-and-conquer technique that reduces the large difference between the misclassification rates.
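The entry notes that high-dimensional LDA effectively ignores the minority class and that the proposed rules shrink the gap between the two per-class error rates. The sketch below illustrates the underlying idea by shifting the LDA decision threshold until the two training error rates are as close as possible; this simple grid search is an assumption, not the paper's divide-and-conquer hard-thresholding construction.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def rebalanced_lda_threshold(X, y, minority_label):
    """Fit LDA, then pick a threshold on the minority-class probability that
    roughly equalises the minority and majority misclassification rates."""
    lda = LinearDiscriminantAnalysis().fit(X, y)
    idx = list(lda.classes_).index(minority_label)
    scores = lda.predict_proba(X)[:, idx]
    is_minority = (y == minority_label)

    best_t, best_gap = 0.5, np.inf
    for t in np.linspace(0.01, 0.99, 99):
        pred_minority = scores >= t
        err_min = np.mean(~pred_minority[is_minority])    # minority miss rate
        err_maj = np.mean(pred_minority[~is_minority])    # majority miss rate
        if abs(err_min - err_maj) < best_gap:
            best_t, best_gap = t, abs(err_min - err_maj)
    return lda, best_t
```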
arXiv Detail & Related papers (2021-11-05T07:44:28Z) - Statistical Theory for Imbalanced Binary Classification [8.93993657323783]
We show that optimal classification performance depends on certain properties of class imbalance that have not previously been formalized.
Specifically, we propose a novel sub-type of class imbalance, which we call Uniform Class Imbalance.
These results provide some of the first meaningful finite-sample statistical theory for imbalanced binary classification.
arXiv Detail & Related papers (2021-07-05T03:55:43Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modern, precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Learning and Evaluating Representations for Deep One-class
Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
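The two-stage recipe in this entry, learning representations from one-class data and then fitting a shallow one-class classifier on top, can be sketched as follows. The representation step is abstracted into precomputed feature arrays, and the one-class SVM head with its hyperparameters is an illustrative choice rather than the paper's exact setup.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import OneClassSVM

def fit_one_class_head(train_features):
    """Stage two: fit a shallow one-class classifier on (assumed precomputed)
    self-supervised features, L2-normalised before fitting."""
    return OneClassSVM(kernel="rbf", nu=0.1).fit(normalize(np.asarray(train_features)))

def anomaly_scores(head, test_features):
    """Higher scores indicate inputs less like the training (normal) class."""
    return -head.decision_function(normalize(np.asarray(test_features)))
```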
arXiv Detail & Related papers (2020-11-04T23:33:41Z) - M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them struggle to generalize to a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
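The major-to-minor translation described here can be sketched as perturbing a sample from a frequent class until a classifier assigns it to a chosen rare class, then adding the result to the training set as a synthetic minority example. The signed-gradient update, step count, and learning rate below are assumptions, and the rejection criteria used in the paper are omitted.

```python
import torch
import torch.nn.functional as F

def translate_major_to_minor(model, x_major, target_minor, steps=10, lr=0.1):
    """Perturb a majority-class input (shape: 1 x ...) so the fixed classifier
    predicts the target minority class; illustrative sketch only."""
    model.eval()
    x = x_major.clone().detach().requires_grad_(True)
    target = torch.tensor([target_minor], device=x_major.device)
    for _ in range(steps):
        loss = F.cross_entropy(model(x), target)
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x -= lr * grad.sign()   # move toward the minority-class decision region
    return x.detach()
```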
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.