Deep Synthetic Minority Over-Sampling Technique
- URL: http://arxiv.org/abs/2003.09788v1
- Date: Sun, 22 Mar 2020 02:44:46 GMT
- Title: Deep Synthetic Minority Over-Sampling Technique
- Authors: Hadi Mansourifar, Weidong Shi
- Abstract summary: We adapt the SMOTE idea within a deep learning architecture.
Deep SMOTE can outperform traditional SMOTE in terms of precision, F1 score, and Area Under the Curve (AUC) in the majority of test cases.
- Score: 3.3707422585608953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic Minority Over-sampling Technique (SMOTE) is the most popular
over-sampling method. However, its random nature makes both the synthesized instances
and the resulting imbalanced-classification results unstable: running SMOTE n
different times yields n different sets of synthesized instances and n different
classification results. To address this problem, we adapt the SMOTE idea within a
deep learning architecture. In this method, a deep neural network regression model
is trained on the inputs and outputs of traditional SMOTE. Each input to the
proposed deep regression model is two randomly chosen data points concatenated into
a double-size vector; the corresponding output is a randomly interpolated data
point between the two chosen vectors, with the original dimension. The experimental
results show that Deep SMOTE can outperform traditional SMOTE in terms of
precision, F1 score, and Area Under the Curve (AUC) in the majority of test cases.
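To make the described setup concrete, a minimal PyTorch sketch of the idea might look as follows; the architecture, sizes, and all names are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the Deep SMOTE idea: a regression network is trained to map
# a concatenated pair of minority-class points to a SMOTE-style interpolation.
import torch
import torch.nn as nn

def make_smote_pairs(X_min, n_pairs):
    """Build (input, target) pairs the way traditional SMOTE would:
    inputs are two random minority points concatenated (dim 2d),
    targets are a random interpolation between them (dim d)."""
    i = torch.randint(0, X_min.shape[0], (n_pairs,))
    j = torch.randint(0, X_min.shape[0], (n_pairs,))
    x1, x2 = X_min[i], X_min[j]
    lam = torch.rand(n_pairs, 1)            # random interpolation factor in [0, 1]
    inputs = torch.cat([x1, x2], dim=1)     # double-size vector
    targets = x1 + lam * (x2 - x1)          # SMOTE-style synthetic point
    return inputs, targets

d = 8                                        # feature dimension (illustrative)
X_min = torch.randn(100, d)                  # stand-in minority-class data

model = nn.Sequential(                       # hypothetical architecture
    nn.Linear(2 * d, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, d),                        # output has the original dimension
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(1000):
    inp, tgt = make_smote_pairs(X_min, n_pairs=64)
    loss = loss_fn(model(inp), tgt)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, the network maps any fixed pair deterministically to a
# synthetic point, so repeated over-sampling is reproducible.
with torch.no_grad():
    synthetic = model(torch.cat([X_min[:32], X_min[32:64]], dim=1))
```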
Related papers
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks first.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
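For reference, the classic SMOTE step that such over-sampling methods build on can be sketched in a few lines of NumPy; the neighbour count and all names here are illustrative, and this is the fixed baseline, not AutoSMOTE itself (which learns these sampling decisions with reinforcement learning).

```python
# Classic SMOTE sketch: each synthetic point lies on the segment between a
# minority sample and one of its k nearest minority-class neighbours.
import numpy as np

def smote_sample(X_min, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    rng = np.random.default_rng(seed)
    n = X_min.shape[0]
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    np.fill_diagonal(d2, np.inf)              # a point is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]      # k nearest minority neighbours
    base = rng.integers(0, n, size=n_new)     # random base points
    pick = nbrs[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))              # interpolation factor per sample
    return X_min[base] + lam * (X_min[pick] - X_min[base])

X_min = np.random.default_rng(1).normal(size=(30, 4))   # stand-in minority data
X_syn = smote_sample(X_min, n_new=60)
```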
- CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator [60.799183326613395]
We propose an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples.
CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling.
We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline.
arXiv Detail & Related papers (2021-10-26T20:14:30Z)
- SMOTified-GAN for class imbalanced pattern classification problems [0.41998444721319217]
We propose a novel two-phase oversampling approach that has the synergy of SMOTE and GAN.
The experimental results show that the sample quality of the minority class(es) is improved on a variety of tested benchmark datasets.
arXiv Detail & Related papers (2021-08-06T06:14:05Z)
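A rough sketch of such a two-phase pipeline, under the assumption that the generator refines SMOTE output rather than transforming pure noise; all architecture and training details below are illustrative, not the paper's implementation.

```python
# Two-phase idea: phase 1 produces SMOTE samples; phase 2 trains a GAN whose
# generator refines those samples so they resemble the real minority class.
import torch
import torch.nn as nn

d = 8
G = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d))          # refiner
D = nn.Sequential(nn.Linear(d, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))  # critic
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

X_real = torch.randn(256, d)          # stand-in real minority samples
X_smote = torch.randn(256, d)         # stand-in phase-1 SMOTE samples

for step in range(2000):
    idx = torch.randint(0, 256, (64,))
    real, seed = X_real[idx], X_smote[idx]
    # discriminator step: real minority vs refined SMOTE samples
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(G(seed).detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # generator step: make refined SMOTE samples look real
    g_loss = bce(D(G(seed)), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

X_oversampled = G(X_smote).detach()   # refined synthetic minority samples
```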
- A multi-schematic classifier-independent oversampling approach for imbalanced datasets [0.0]
It is evident from previous studies that different oversampling algorithms have different degrees of efficiency with different classifiers.
Here, we overcome this problem with a multi-schematic and classifier-independent oversampling approach: ProWRAS.
ProWRAS integrates the Localized Random Affine Shadowsampling (LoRAS) algorithm and the Proximity Weighted Synthetic oversampling (ProWSyn) algorithm.
arXiv Detail & Related papers (2021-07-15T14:03:24Z)
- ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables [60.799183326613395]
ARMS is an Antithetic-REINFORCE-based Multi-Sample gradient estimator.
ARMS uses a copula to generate any number of mutually antithetic samples.
We evaluate ARMS on several datasets for training generative models, and our experimental results show that it outperforms competing methods.
arXiv Detail & Related papers (2021-05-28T23:19:54Z)
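A minimal NumPy illustration of the antithetic idea for a single Bernoulli variable; ARMS itself couples any number of samples through a copula, which this two-sample sketch omits.

```python
# The antithetic pair (u, 1 - u) yields two negatively correlated Bernoulli
# samples with the correct marginal, so averaging their REINFORCE terms stays
# unbiased while the variance drops.
import numpy as np

rng = np.random.default_rng(0)
p = 0.3                                   # Bernoulli parameter, p = sigmoid(theta)
f = lambda z: (z - 0.7) ** 2              # objective whose expectation we differentiate

def reinforce(z):                         # f(z) * d/dtheta log Bern(z; sigmoid(theta))
    return f(z) * (z - p)                 # score w.r.t. the logit theta is (z - p)

n = 200_000
u = rng.random(n)
z1 = (u < p).astype(float)                # ordinary sample
z2 = ((1.0 - u) < p).astype(float)        # antithetic sample, same marginal

g_iid = 0.5 * (reinforce(z1) + reinforce((rng.random(n) < p).astype(float)))
g_anti = 0.5 * (reinforce(z1) + reinforce(z2))

print("mean iid / antithetic:", g_iid.mean(), g_anti.mean())   # both unbiased
print("var  iid / antithetic:", g_iid.var(), g_anti.var())     # antithetic is lower
```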
- GMOTE: Gaussian based minority oversampling technique for imbalanced classification adapting tail probability of outliers [0.0]
Data-level approaches mainly use oversampling methods to address the problem, such as the synthetic minority oversampling technique (SMOTE).
In this paper, we propose a Gaussian based minority oversampling technique (GMOTE) with a statistical perspective for imbalanced datasets.
When GMOTE is combined with a classification and regression tree (CART) or a support vector machine (SVM), it shows better accuracy and F1 score.
arXiv Detail & Related papers (2021-05-09T07:04:37Z)
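A toy version of the Gaussian sampling step, assuming a single Gaussian fit to the minority class; GMOTE's adaptation of the tail probability of outliers is omitted here.

```python
# Minimal Gaussian-based over-sampling sketch: fit a Gaussian to the minority
# class and draw synthetic samples from it.
import numpy as np

def gaussian_oversample(X_min, n_new, seed=0):
    rng = np.random.default_rng(seed)
    mu = X_min.mean(axis=0)
    cov = np.cov(X_min, rowvar=False)      # empirical covariance of the minority class
    return rng.multivariate_normal(mu, cov, size=n_new)

X_min = np.random.default_rng(1).normal(size=(40, 5))   # stand-in minority data
X_syn = gaussian_oversample(X_min, n_new=100)
```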
- SMOTE-ENC: A novel SMOTE-based method to generate synthetic data for nominal and continuous features [0.38073142980733]
We present a novel minority over-sampling method, SMOTE-ENC (SMOTE - Encoded Nominal and Continuous).
Our experiments show that a classification model using the SMOTE-ENC method offers better predictions than a model using SMOTE-NC.
Our proposed method addresses one of the major limitations of the SMOTE-NC algorithm.
arXiv Detail & Related papers (2021-03-13T04:16:17Z)
- The Integrity of Machine Learning Algorithms against Software Defect Prediction [0.0]
This report analyses the performance of the Online Sequential Extreme Learning Machine (OS-ELM) proposed by Liang et al.
OS-ELM trains faster than conventional deep neural networks and always converges to the globally optimal solution.
The analysis is carried out on three NASA projects: KC1, PC4, and PC3.
arXiv Detail & Related papers (2020-09-05T17:26:56Z)
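A compact sketch of the OS-ELM recursion as usually formulated (a random, fixed hidden layer whose output weights are the least-squares solution, updated recursively as new chunks arrive); dimensions and names are illustrative.

```python
# OS-ELM sketch: hidden weights are random and never trained; the output
# weights solve a least-squares problem (hence the global optimum for that
# layer) and are updated by recursive least squares on each new data chunk.
import numpy as np

rng = np.random.default_rng(0)
d, h, m = 10, 50, 2                        # input dim, hidden units, outputs

W = rng.normal(size=(d, h))                # random, fixed hidden weights
b = rng.normal(size=h)
hidden = lambda X: np.tanh(X @ W + b)

# --- initialization phase on a first batch (X0, T0) ---
X0, T0 = rng.normal(size=(100, d)), rng.normal(size=(100, m))
H0 = hidden(X0)
P = np.linalg.inv(H0.T @ H0 + 1e-6 * np.eye(h))   # small ridge for stability
beta = P @ H0.T @ T0                               # least-squares output weights

# --- sequential phase: update on each chunk without revisiting old data ---
def oselm_update(P, beta, X_new, T_new):
    H = hidden(X_new)
    K = np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - P @ H.T @ K @ H @ P                    # recursive least squares
    beta = beta + P @ H.T @ (T_new - H @ beta)
    return P, beta

for _ in range(5):
    Xc, Tc = rng.normal(size=(20, d)), rng.normal(size=(20, m))
    P, beta = oselm_update(P, beta, Xc, Tc)

predict = lambda X: hidden(X) @ beta
```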
- RAIN: A Simple Approach for Robust and Accurate Image Classification Networks [156.09526491791772]
It has been shown that the majority of existing adversarial defense methods achieve robustness at the cost of sacrificing prediction accuracy.
This paper proposes a novel preprocessing framework, which we term Robust and Accurate Image classificatioN (RAIN).
RAIN applies randomization over inputs to break the ties between the model forward prediction path and the backward gradient path, thus improving the model robustness.
We conduct extensive experiments on the STL10 and ImageNet datasets to verify the effectiveness of RAIN against various types of adversarial attacks.
arXiv Detail & Related papers (2020-04-24T02:03:56Z)
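One common form of input randomization is a random resize-and-pad transform, sketched below in PyTorch; this is an assumption for illustration, not RAIN's exact preprocessing pipeline.

```python
# Generic input randomization in the spirit described above: a random transform
# applied before the forward pass decouples the gradient an attacker computes
# from the path the deployed model actually uses.
import torch
import torch.nn.functional as F

def random_resize_pad(x, max_shrink=8):
    """x: (N, C, H, W). Resize to a random smaller size, then pad back randomly."""
    n, c, h, w = x.shape
    s = int(torch.randint(0, max_shrink + 1, (1,)))
    x = F.interpolate(x, size=(h - s, w - s), mode="bilinear", align_corners=False)
    top = int(torch.randint(0, s + 1, (1,)))
    left = int(torch.randint(0, s + 1, (1,)))
    return F.pad(x, (left, s - left, top, s - top))  # (l, r, t, b) zero padding

x = torch.randn(4, 3, 96, 96)          # stand-in batch (e.g. STL10-sized inputs)
x_rand = random_resize_pad(x)          # shape is restored to (4, 3, 96, 96)
# logits = model(x_rand)               # the classifier sees a randomized input
```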
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
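A simplified sketch of layer-wise fusion for a single hidden layer, using a hard neuron assignment (a special case of the optimal-transport coupling the paper uses) to align one model's neurons with the other's before averaging; all names and sizes are illustrative.

```python
# Layer-wise fusion of two MLP layers: match each neuron of model B to a
# neuron of model A by incoming-weight similarity, permute B accordingly
# (including the weights that consume B's hidden units), then average.
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layer(Wa, Wb, Va, Vb):
    """Wa, Wb: (hidden, in) incoming weights; Va, Vb: (out, hidden) outgoing."""
    # cost of matching neuron j of B to neuron i of A: squared distance
    # between their incoming weight vectors
    cost = ((Wa[:, None, :] - Wb[None, :, :]) ** 2).sum(-1)
    _, perm = linear_sum_assignment(cost)   # perm[i] = B neuron matched to A neuron i
    Wb_aligned = Wb[perm]                   # permute B's hidden neurons
    Vb_aligned = Vb[:, perm]                # and the next layer's input weights
    return 0.5 * (Wa + Wb_aligned), 0.5 * (Va + Vb_aligned)

rng = np.random.default_rng(0)
Wa, Wb = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
Va, Vb = rng.normal(size=(4, 16)), rng.normal(size=(4, 16))
W_fused, V_fused = fuse_layer(Wa, Wb, Va, Vb)
```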