Related papers: Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO

URL: http://arxiv.org/abs/2210.12870v1
Date: Sun, 23 Oct 2022 22:17:54 GMT
Title: Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO
Authors: Md Manjurul Ahsan, Md Shahin Ali, and Zahed Siddique
Abstract summary: This study proposes two novel techniques: GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG). Preliminary computational results show that SSG and GBO performed better than the original SMOTE on eight expanded imbalanced benchmark datasets.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Class imbalance in a dataset is one of the major challenges that can significantly impact the performance of machine learning models, resulting in biased predictions. Numerous techniques have been proposed to address class imbalance problems, including, but not limited to, oversampling, undersampling, and cost-sensitive approaches. Because they can generate synthetic data, oversampling techniques such as the Synthetic Minority Oversampling Technique (SMOTE) are among the methods most widely used by researchers. However, one potential disadvantage of SMOTE is that newly created minority samples may overlap with majority samples. As a result, ML models are more likely to be biased towards the majority classes. Recently, generative adversarial networks (GANs) have garnered much attention due to their ability to create almost-real samples. However, GANs are hard to train despite their potential. This study proposes two novel techniques, GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG), to overcome the limitations of existing oversampling approaches. Preliminary computational results show that SSG and GBO performed better than the original SMOTE on eight expanded imbalanced benchmark datasets. The study also revealed that the minority samples generated by SSG follow Gaussian distributions, which is often difficult to achieve using the original SMOTE.
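The overlap concern raised in the abstract is easier to see with the interpolation step of SMOTE written out. The following is a minimal sketch of the standard SMOTE idea (pick a minority point, pick one of its k nearest minority neighbours, and place a new point on the segment between them). It is not the SSG or GBO method from the paper, whose details are not given here; the function name `smote_sketch`, its parameters, and the toy data are assumptions made purely for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Hypothetical helper: create n_new synthetic minority points by
    interpolating between a minority point and one of its k nearest
    minority neighbours (the core step of standard SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Candidate neighbours: every other minority point, nearest first.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy 2-D minority class (made-up values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=5)
```

Because each synthetic point depends only on minority samples, nothing in this step checks where the majority samples lie. If majority points fall between two minority neighbours, the interpolated point can land among them, which is the overlap the abstract describes.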

Related papers

SMOGAN: Synthetic Minority Oversampling with GAN Refinement for Imbalanced Regression [0.0]
Imbalanced regression refers to prediction tasks where the target variable is skewed. This skewness hinders machine learning models, especially neural networks, which concentrate on dense regions. We propose SMOGAN, a two-step oversampling framework for imbalanced regression.
arXiv Detail & Related papers (2025-04-29T20:15:25Z)
Minimum Enclosing Ball Synthetic Minority Oversampling Technique from a Geometric Perspective [1.7851435784917604]
Class imbalance refers to a significant difference in the number of samples from different classes within a dataset. This issue is prevalent in real-world classification tasks, such as software defect prediction, medical diagnosis, and fraud detection. The synthetic minority oversampling technique (SMOTE) is widely used to address the class imbalance issue. This paper proposes the Minimum Enclosing Ball SMOTE (MEB-SMOTE) method from a geometric perspective.
arXiv Detail & Related papers (2024-08-07T03:37:25Z)
GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data. In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions [0.0]
Class imbalanced problems (CIP) are one of the potential challenges in developing unbiased Machine Learning (ML) models for predictions. CIP occurs when data samples are not equally distributed across two or more classes. We propose a hybrid oversampling technique that combines the power of borderline SMOTE and a Generative Adversarial Network to generate more diverse data.
arXiv Detail & Related papers (2023-05-16T20:02:39Z)
Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data. We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
Generative Oversampling for Imbalanced Data via Majority-Guided VAE [15.93867386081279]
We propose a novel over-sampling model, called Majority-Guided VAE (MGVAE), which generates new minority samples under the guidance of a majority-based prior. In this way, the newly generated minority samples can inherit the diversity and richness of the majority ones, thus mitigating overfitting in downstream tasks.
arXiv Detail & Related papers (2023-02-14T06:35:23Z)
Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class. Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class. We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, in which a teacher model generates hard pseudo-labels on unlabeled data as supervisory signals. We analyze the challenges these methods face, using empirical experimental results. We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z)
SMOTified-GAN for class imbalanced pattern classification problems [0.41998444721319217]
We propose a novel two-phase oversampling approach that has the synergy of SMOTE and GAN. The experimental results prove that the sample quality of the minority class(es) has been improved on a variety of tested benchmark datasets.
arXiv Detail & Related papers (2021-08-06T06:14:05Z)
Rethinking Sampling Strategies for Unsupervised Person Re-identification [59.47536050785886]
We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function. Group sampling is proposed, which gathers samples from the same class into groups. Experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods.
arXiv Detail & Related papers (2021-07-07T05:39:58Z)
Score-based Generative Modeling in Latent Space [93.8985523558869]
Score-based generative models (SGMs) have recently demonstrated impressive results in terms of both sample quality and distribution coverage. Here, we propose the Latent Score-based Generative Model (LSGM), a novel approach that trains SGMs in a latent space. Moving from data to latent space allows us to train more expressive generative models, apply SGMs to non-continuous data, and learn smoother SGMs in a smaller space.
arXiv Detail & Related papers (2021-06-10T17:26:35Z)
Conditional Wasserstein GAN-based Oversampling of Tabular Data for Imbalanced Learning [10.051309746913512]
We propose an oversampling method based on a conditional Wasserstein GAN. We benchmark our method against standard oversampling methods and the imbalanced baseline on seven real-world datasets.
arXiv Detail & Related papers (2020-08-20T20:33:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.