SetConv: A New Approach for Learning from Imbalanced Data
- URL: http://arxiv.org/abs/2104.06313v1
- Date: Sat, 3 Apr 2021 22:33:30 GMT
- Title: SetConv: A New Approach for Learning from Imbalanced Data
- Authors: Yang Gao, Yi-Fan Li, Yu Lin, Charu Aggarwal, Latifur Khan
- Abstract summary: We propose a set convolution operation and an episodic training strategy to extract a single representative for each class.
We prove that the proposed algorithm is invariant to the order of its inputs (i.e., permutation-invariant).
- Score: 29.366843553056594
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: For many real-world classification problems, e.g., sentiment classification,
most existing machine learning methods are biased towards the majority class
when the Imbalance Ratio (IR) is high. To address this problem, we propose a
set convolution (SetConv) operation and an episodic training strategy to
extract a single representative for each class, so that classifiers can later
be trained on a balanced class distribution. We prove that our proposed
algorithm is invariant to the order of its inputs (i.e., permutation-invariant), and experiments
on multiple large-scale benchmark text datasets show the superiority of our
proposed framework when compared to other SOTA methods.
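The core idea is compact: a permutation-invariant aggregation maps all samples of one class to a single representative, so the downstream classifier effectively sees a balanced, one-representative-per-class episode. The PyTorch sketch below illustrates that general recipe only; the weighting network, layer sizes, and the similarity-based episode loss are illustrative assumptions, not the paper's exact SetConv operator.

```python
import torch
import torch.nn as nn


class SetAggregator(nn.Module):
    """Illustrative permutation-invariant aggregator (an assumption-laden sketch,
    not the paper's exact SetConv operator). It maps a variable-sized set of
    feature vectors from one class to a single representative vector."""

    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Scores each set element; a softmax over the set turns scores into weights.
        self.weight_net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )
        self.value_net = nn.Linear(in_dim, in_dim)  # per-element transformation

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (set_size, in_dim) -- all samples of a single class.
        weights = torch.softmax(self.weight_net(x), dim=0)      # (set_size, 1)
        # Weighted sum over the set: the output is unchanged under any
        # reordering of the rows of x, hence permutation-invariant.
        return (weights * self.value_net(x)).sum(dim=0)         # (in_dim,)


def episodic_loss(aggregator, features_by_class, query_x, query_y):
    """One illustrative episode: build one representative per class, then score
    query samples against the representatives -- a balanced comparison even if
    the raw class counts are highly skewed."""
    reps = torch.stack([aggregator(f) for f in features_by_class])  # (C, in_dim)
    logits = query_x @ reps.t()                                     # (B, C)
    return nn.functional.cross_entropy(logits, query_y)
```

The permutation invariance highlighted in the abstract falls out of the symmetric, summation-based pooling.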
Related papers
- Methods for Class-Imbalanced Learning with Support Vector Machines: A Review and an Empirical Evaluation [22.12895887111828]
We introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning.
We compare the performances of various representative SVM-based models in each category using benchmark imbalanced data sets.
Our findings reveal that while algorithmic methods are less time-consuming owing to no data pre-processing requirements, fusion methods, which combine both re-sampling and algorithmic approaches, generally perform the best.
arXiv Detail & Related papers (2024-06-05T15:55:08Z)
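The review's headline finding, that "fusion" methods combining re-sampling with algorithm-level cost adjustments tend to perform best, is easy to picture as a pipeline. The snippet below is a generic illustration using scikit-learn and imbalanced-learn; the choice of SMOTE, an RBF kernel, and balanced class weights is an assumption for demonstration, not the configuration evaluated in the review.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy imbalanced binary problem (roughly 95% / 5%).
X, y = make_classification(n_samples=2000, n_informative=5,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# "Fusion" baseline: data-level re-sampling (SMOTE) plus algorithm-level cost
# weighting (class_weight="balanced") in one pipeline, so the oversampling
# only ever touches the training folds.
fusion_svm = Pipeline([
    ("oversample", SMOTE(random_state=0)),
    ("svm", SVC(kernel="rbf", class_weight="balanced")),
])
fusion_svm.fit(X_tr, y_tr)
print(classification_report(y_te, fusion_svm.predict(X_te)))
```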
- Mutual Exclusive Modulator for Long-Tailed Recognition [12.706961256329572]
Long-tailed recognition is the task of learning high-performance classifiers given extremely imbalanced training samples between categories.
We introduce a mutual exclusive modulator which can estimate the probability of an image belonging to each group.
Our method achieves competitive performance compared to the state-of-the-art benchmarks.
arXiv Detail & Related papers (2023-02-19T07:31:49Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Ensemble Classifier Design Tuned to Dataset Characteristics for Network Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z)
- Class-Incremental Learning with Strong Pre-trained Models [97.84755144148535]
Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes).
We explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes.
Our proposed method is robust and generalizes to all analyzed CIL settings.
arXiv Detail & Related papers (2022-04-07T17:58:07Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT, and Webvision datasets, observing that Prototypical obtains substantial improvements compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
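A prototype-based classifier adds no parameters on top of the embedding network: each class is summarized by the mean of its embedded samples and queries are assigned to the nearest prototype, which is why predictions stay comparable across classes of very different sizes. The sketch below shows that generic recipe; it is not claimed to be the exact construction of the paper above.

```python
import torch


def class_prototypes(embeddings: torch.Tensor, labels: torch.Tensor,
                     num_classes: int) -> torch.Tensor:
    """Mean embedding per class, shape (num_classes, dim). A minority class with
    few samples still gets exactly one prototype, so no class dominates.
    Assumes every class index 0..num_classes-1 occurs at least once in labels."""
    return torch.stack([embeddings[labels == c].mean(dim=0)
                        for c in range(num_classes)])


def prototype_predict(queries: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Assign each query embedding to the nearest prototype (Euclidean distance)."""
    return torch.cdist(queries, prototypes).argmin(dim=1)
```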
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
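The calibration step is simple to outline: approximate each class's feature distribution with a Gaussian, sample an equal number of "virtual" features per class, and refit only the classifier head on that balanced virtual set. The sketch below illustrates that idea in a centralized, diagonal-covariance form; the paper's federated protocol (how the Gaussian statistics are aggregated across clients) is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def calibrate_head_with_virtual_features(feats, labels, n_virtual=500, seed=0):
    """Per-class Gaussian (diagonal covariance, a simplifying assumption) over the
    feature space; sample a balanced virtual feature set; refit a linear head."""
    rng = np.random.default_rng(seed)
    vx, vy = [], []
    for c in np.unique(labels):
        class_feats = feats[labels == c]
        mu = class_feats.mean(axis=0)
        sigma = class_feats.std(axis=0) + 1e-6           # avoid zero variance
        vx.append(rng.normal(mu, sigma, size=(n_virtual, feats.shape[1])))
        vy.append(np.full(n_virtual, c))
    head = LogisticRegression(max_iter=1000)
    head.fit(np.concatenate(vx), np.concatenate(vy))     # balanced by construction
    return head
```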
- Hybrid Ensemble optimized algorithm based on Genetic Programming for imbalanced data classification [0.0]
We propose a hybrid ensemble algorithm based on Genetic Programming (GP) for two-class imbalanced data classification.
Experimental results on the specified data sets show that, with training-set proportions of 40% and 50%, the proposed method achieves better minority-class prediction accuracy than at other split sizes.
arXiv Detail & Related papers (2021-06-02T14:14:38Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them struggle to generalize to a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method improves the generalization on minority classes significantly compared to other existing re-sampling or re-weighting methods.
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
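The "major-to-minor translation" can be pictured as a small input-space optimization: start from a majority-class sample and perturb it until a frozen, pretrained classifier assigns it to the target minority class, then treat the result as a synthetic minority example. The PyTorch sketch below shows only that bare mechanism with arbitrary step size and iteration count; the paper's full procedure involves additional components (e.g., sample selection and rejection criteria) that are not reproduced here.

```python
import torch
import torch.nn.functional as F


def translate_to_minority(pretrained_clf, x_major, minority_class,
                          steps=10, step_size=0.1):
    """Perturb majority-class inputs so a frozen pretrained classifier predicts
    the target minority class; the outputs serve as synthetic minority samples."""
    pretrained_clf.eval()
    x = x_major.clone().detach().requires_grad_(True)
    target = torch.full((x.shape[0],), minority_class, dtype=torch.long,
                        device=x.device)
    for _ in range(steps):
        loss = F.cross_entropy(pretrained_clf(x), target)
        grad, = torch.autograd.grad(loss, x)
        # Signed-gradient step toward the minority class (an illustrative choice).
        x = (x - step_size * grad.sign()).detach().requires_grad_(True)
    return x.detach()
```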