Imbalanced Class Data Performance Evaluation and Improvement using Novel
Generative Adversarial Network-based Approach: SSG and GBO
- URL: http://arxiv.org/abs/2210.12870v1
- Date: Sun, 23 Oct 2022 22:17:54 GMT
- Title: Imbalanced Class Data Performance Evaluation and Improvement using Novel
Generative Adversarial Network-based Approach: SSG and GBO
- Authors: Md Manjurul Ahsan, Md Shahin Ali, and Zahed Siddique
- Abstract summary: This study proposes two novel techniques: GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG).
The preliminary computational results show that SSG and GBO performed better than the original SMOTE on eight expanded imbalanced benchmark datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Class imbalance in a dataset is one of the major challenges that can
significantly impact the performance of machine learning models, resulting in
biased predictions. Numerous techniques have been proposed to address class
imbalance problems, including, but not limited to, oversampling,
undersampling, and cost-sensitive approaches. Because they can generate
synthetic data, oversampling techniques such as the Synthetic Minority
Oversampling Technique (SMOTE) are among the methods most widely used by
researchers. However, one of SMOTE's potential disadvantages is that newly
created minority samples may overlap with majority samples. As a result, the
probability that ML models are biased towards the majority class increases.
Recently, generative adversarial networks (GANs) have garnered much attention due
to their ability to create almost real samples. However, GANs are hard to train
despite their potential. This study proposes two novel techniques,
GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG), to
overcome the limitations of the existing oversampling approaches. The
preliminary computational results show that SSG and GBO performed better than
the original SMOTE on eight expanded imbalanced benchmark datasets. The study
also revealed that the minority samples generated by SSG exhibit Gaussian
distributions, which is often difficult to achieve using the original SMOTE.
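The SMOTE baseline that the abstract compares against can be illustrated with a minimal sketch. This is the classic SMOTE interpolation step only, not the paper's SSG or GBO methods, whose details are not given on this page. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration. Because the interpolation looks only at minority samples, nothing prevents a synthetic point from landing in a region occupied by majority samples, which is the overlap problem the abstract describes.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the paper's SSG or GBO.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest minority neighbours.
        neighbour = rng.choice(others[:k])
        # Interpolate: x_new = x + lam * (x_nn - x), with lam drawn from [0, 1).
        lam = rng.random()
        synthetic.append(
            tuple(b + lam * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-dimensional minority class (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=5)
```

Each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours, so the generated data never leaves the region spanned by the existing minority samples.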
Related papers
- Minimum Enclosing Ball Synthetic Minority Oversampling Technique from a Geometric Perspective [1.7851435784917604]
Class imbalance refers to the significant difference in the number of samples from different classes within a dataset.
This issue is prevalent in real-world classification tasks, such as software defect prediction, medical diagnosis, and fraud detection.
The synthetic minority oversampling technique (SMOTE) is widely used to address the class imbalance issue.
This paper proposes the Minimum Enclosing Ball SMOTE (MEB-SMOTE) method from a geometric perspective.
arXiv Detail & Related papers (2024-08-07T03:37:25Z)
- GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data.
In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions [0.0]
Class imbalanced problems (CIP) are one of the potential challenges in developing unbiased Machine Learning (ML) models for predictions.
CIP occurs when data samples are not equally distributed between two or more classes.
We propose a hybrid oversampling technique by combining the power of borderline SMOTE and Generative Adversarial Network to generate more diverse data.
arXiv Detail & Related papers (2023-05-16T20:02:39Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Generative Oversampling for Imbalanced Data via Majority-Guided VAE [15.93867386081279]
We propose a novel over-sampling model, called Majority-Guided VAE (MGVAE), which generates new minority samples under the guidance of a majority-based prior.
In this way, the newly generated minority samples can inherit the diversity and richness of the majority ones, thus mitigating overfitting in downstream tasks.
arXiv Detail & Related papers (2023-02-14T06:35:23Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, generating hard pseudo-labels by a teacher model on unlabeled data as supervisory signals.
We analyze the challenges these methods face, using empirical experimental results.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
- SMOTified-GAN for class imbalanced pattern classification problems [0.41998444721319217]
We propose a novel two-phase oversampling approach that has the synergy of SMOTE and GAN.
The experimental results show that the sample quality of the minority class(es) has been improved across a variety of tested benchmark datasets.
arXiv Detail & Related papers (2021-08-06T06:14:05Z)
- Score-based Generative Modeling in Latent Space [93.8985523558869]
Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage.
Here, we propose the Latent Score-based Generative Model (LSGM), a novel approach that trains SGMs in a latent space.
Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space.
arXiv Detail & Related papers (2021-06-10T17:26:35Z)
- Conditional Wasserstein GAN-based Oversampling of Tabular Data for Imbalanced Learning [10.051309746913512]
We propose an oversampling method based on a conditional Wasserstein GAN.
We benchmark our method against standard oversampling methods and the imbalanced baseline on seven real-world datasets.
arXiv Detail & Related papers (2020-08-20T20:33:56Z)