OOG- Optuna Optimized GAN Sampling Technique for Tabular Imbalanced
Malware Data
- URL: http://arxiv.org/abs/2212.01274v1
- Date: Fri, 25 Nov 2022 16:59:30 GMT
- Authors: S.M Towhidul Islam Tonmoy and S.M Mehedi Zaman
- Abstract summary: This study uses the Generative Adversarial Network (GAN) sampling technique to generate new malware samples.
It presents the architecture of the Optuna Optimized GAN (OOG) method and reports scores of 98.06%, 99.00%, 97.23%, and 98.04% for accuracy, precision, recall, and F1 score, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cyberspace occupies a large part of people's lives in the age of modern
technology, and while some use it for good, others do not. Malware is software
built with malicious intent; it can damage, steal, or alter personal
information and compromise otherwise secure applications. One of the many
techniques for defending against malware is to generate malware samples so
that a detection system can keep pace with the growing volume of malware and
recognize it when it attempts to enter. The Generative Adversarial Network
(GAN) sampling technique has been used in this study to generate new malware
samples. GANs come in multiple variants, and determining which variant works
best for a given dataset requires tuning their hyperparameters. This study
employs Optuna, an automated hyperparameter optimization framework, to
find the best settings for the dataset under consideration. In this
study, the architecture of the Optuna Optimized GAN (OOG) method is shown,
along with scores of 98.06%, 99.00%, 97.23%, and 98.04% for accuracy,
precision, recall, and F1 score, respectively. To obtain this result, the
methodology tunes the hyperparameters of five supervised tree-based ensemble
classifiers (XGBoost, LightGBM, CatBoost, Extra Trees Classifier, and Gradient
Boosting Classifier) and combines them with a weighted ensemble technique. In
addition to comparing against existing work in this domain, the study
demonstrates how promising GAN sampling is relative to other sampling
techniques such as SMOTE.
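The abstract names a weighted ensemble and four evaluation metrics without giving details. As a rough illustration only, the sketch below assumes the ensemble is a weighted soft vote over per-model malware probabilities, which is one common form and may differ from what the paper does. The probabilities, weights, and the 0.5 threshold are hypothetical toy values, and the function names `weighted_soft_vote` and `binary_metrics` are made up for this example.

```python
from typing import Dict, List, Sequence


def weighted_soft_vote(probs_per_model: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weighted average of each model's predicted malware probability."""
    total = sum(weights)
    n_samples = len(probs_per_model[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probs_per_model)) / total
        for i in range(n_samples)
    ]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = malware)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: three stand-in classifiers scoring six samples.
y_true = [1, 0, 1, 1, 0, 0]
model_probs = [
    [0.9, 0.2, 0.4, 0.8, 0.3, 0.9],  # stand-in model A
    [0.8, 0.1, 0.7, 0.6, 0.4, 0.3],  # stand-in model B
    [0.7, 0.4, 0.6, 0.9, 0.2, 0.6],  # stand-in model C
]
weights = [1.0, 2.0, 1.0]  # assumed weights; in practice these would be tuned

scores = weighted_soft_vote(model_probs, weights)
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy run the last benign sample is misclassified as malware, so precision falls below recall. In a setup like the one the abstract describes, the weights themselves could be treated as further hyperparameters for a tuner such as Optuna to search over, though the source does not say whether that was done.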
Related papers
- Scalable and Effective Negative Sample Generation for Hyperedge Prediction [55.9298019975967]
Hyperedge prediction is crucial for understanding complex multi-entity interactions in web-based applications.
Traditional methods often face difficulties in generating high-quality negative samples due to imbalance between positive and negative instances.
We present the scalable and effective negative sample generation for Hyperedge Prediction (SEHP) framework, which utilizes diffusion models to tackle these challenges.
arXiv Detail & Related papers (2024-11-19T09:16:25Z)
- A New Formulation for Zeroth-Order Optimization of Adversarial EXEmples in Malware Detection [14.786557372850094]
We show how learning malware detectors can be cast within a zeroth-order optimization framework.
We propose and study ZEXE, a novel zero-order attack against Windows malware detection.
arXiv Detail & Related papers (2024-05-23T13:01:36Z)
- GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model [69.71629949747884]
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data.
In this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples.
arXiv Detail & Related papers (2024-01-11T16:43:16Z)
- A Comparison of Adversarial Learning Techniques for Malware Detection [1.2289361708127875]
We use gradient-based, evolutionary algorithm-based, and reinforcement-based methods to generate adversarial samples.
Experiments show that the Gym-malware generator, which uses a reinforcement learning approach, has the greatest practical potential.
arXiv Detail & Related papers (2023-08-19T09:22:32Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
- An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
arXiv Detail & Related papers (2022-10-27T01:58:10Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
Using the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- The Integrity of Machine Learning Algorithms against Software Defect Prediction [0.0]
This report analyses the performance of the Online Sequential Extreme Learning Machine (OS-ELM) proposed by Liang et al.
OS-ELM trains faster than conventional deep neural networks and it always converges to the globally optimal solution.
The analysis covers three NASA projects: KC1, PC4, and PC3.
arXiv Detail & Related papers (2020-09-05T17:26:56Z)
- MDEA: Malware Detection with Evolutionary Adversarial Learning [16.8615211682877]
MDEA, an adversarial malware detection model, uses evolutionary optimization to create attack samples that make the network robust against evasion attacks.
Retraining the model with the evolved malware samples improves its performance by a significant margin.
arXiv Detail & Related papers (2020-02-09T09:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.