Improving Android Malware Detection Through Data Augmentation Using
Wasserstein Generative Adversarial Networks
- URL: http://arxiv.org/abs/2403.00890v2
- Date: Tue, 5 Mar 2024 14:33:33 GMT
- Title: Improving Android Malware Detection Through Data Augmentation Using
Wasserstein Generative Adversarial Networks
- Authors: Kawana Stalin, Mikias Berhanu Mekoya
- Abstract summary: Generative Adversarial Networks (GANs) have demonstrated their versatility across various applications.
This research explores the effectiveness of utilizing GAN-generated data to train a model for the detection of Android malware.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) have demonstrated their versatility
across various applications, including data augmentation and malware detection.
This research explores the effectiveness of utilizing GAN-generated data to
train a model for the detection of Android malware. Given the considerable
storage requirements of Android applications, the study proposes a method to
synthetically represent data using GANs, thereby reducing storage demands. The
proposed methodology involves creating image representations of features
extracted from an existing dataset. A GAN model is then employed to generate a
more extensive dataset consisting of realistic synthetic grayscale images.
Subsequently, this synthetic dataset is utilized to train a Convolutional
Neural Network (CNN) designed to identify previously unseen Android malware
applications. The study includes a comparative analysis of the CNN's
performance when trained on real images versus synthetic images generated by
the GAN. Furthermore, the research explores variations in performance between
the Wasserstein Generative Adversarial Network (WGAN) and the Deep
Convolutional Generative Adversarial Network (DCGAN). The investigation extends
to studying the impact of image size and malware obfuscation on the
classification model's effectiveness. The data augmentation approach
implemented in this study resulted in a notable performance enhancement of the
classification model, ranging from 1.5% to 7%, depending on the dataset. The
highest achieved F1 score reached 0.975.
Keywords: Generative Adversarial Networks, Android Malware, Data Augmentation, Wasserstein Generative Adversarial Network
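The abstract says that features extracted from an existing dataset are turned into grayscale images before GAN training, but it does not give the exact encoding. The sketch below shows one plausible way to do that step, and every detail in it is an assumption rather than the authors' method: the function name `features_to_grayscale`, min-max scaling to the 0-255 range, zero padding, and a square row-major layout are all hypothetical choices.

```python
import math

import numpy as np


def features_to_grayscale(features, side=None):
    """Map a 1-D feature vector to a square 8-bit grayscale image.

    Assumed encoding (not taken from the paper): min-max scale the values
    to 0..255, zero-pad the tail, and reshape row by row.
    """
    vec = np.asarray(features, dtype=np.float64).ravel()
    if vec.size == 0:
        raise ValueError("feature vector is empty")
    if side is None:
        # Smallest side length whose square can hold every feature.
        side = math.isqrt(vec.size - 1) + 1
    if vec.size > side * side:
        raise ValueError("feature vector does not fit in the requested image")

    lo, hi = vec.min(), vec.max()
    if hi > lo:
        scaled = (vec - lo) / (hi - lo)
    else:
        # A constant vector carries no contrast; map it to black.
        scaled = np.zeros_like(vec)

    pixels = np.zeros(side * side, dtype=np.uint8)
    pixels[: vec.size] = np.round(scaled * 255).astype(np.uint8)
    return pixels.reshape(side, side)


# Hypothetical binary feature vector (for example, permission flags).
image = features_to_grayscale([1, 0, 1, 1, 0])
print(image.shape, image.dtype)
```

Images produced this way could then serve as the real samples for a WGAN or DCGAN and as CNN input. The image side length is left as a parameter because the study reports examining the effect of image size on classification.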
Related papers
- Enhancing Network Intrusion Detection Performance using Generative Adversarial Networks [0.25163931116642785]
We propose a novel approach for enhancing the performance of an NIDS through the integration of Generative Adversarial Networks (GANs).
GANs generate synthetic network traffic data that closely mimics real-world network behavior.
Our findings show that the integration of GANs into NIDS can lead to enhancements in intrusion detection performance for attacks with limited training data.
arXiv Detail & Related papers (2024-04-11T04:01:15Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion [54.668445421149364]
Deep learning-based hyperspectral image (HSI) super-resolution aims to generate a high spatial resolution HSI (HR-HSI) by fusing an HSI and a multispectral image (MSI) with deep neural networks (DNNs).
In this letter, we propose a novel adversarial automatic data augmentation framework, ADASR, that automatically optimizes and augments HSI-MSI sample pairs to enrich data diversity for HSI-MSI fusion.
arXiv Detail & Related papers (2023-10-11T07:30:37Z)
- Generative Adversarial Networks for Data Augmentation [0.0]
GANs have been utilized in medical image analysis for various tasks, including data augmentation, image creation, and domain adaptation.
GANs can generate synthetic samples that can be used to increase the available dataset.
It is essential to note that the use of GANs in medical imaging is still an active area of research to ensure that the produced images are of high quality and suitable for use in clinical settings.
arXiv Detail & Related papers (2023-06-03T06:33:33Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- Generative Adversarial Networks for Data Generation in Structural Health Monitoring [0.8250374560598496]
In AI, Machine Learning (ML) and Deep Learning (DL) algorithms require large amounts of data to train.
In SHM applications, collecting data from civil structures through sensors is expensive and obtaining useful data (damage associated data) is challenging.
This paper shows that for the cases of insufficient data in DL or ML-based damage diagnostics, 1-D WDCGAN-GP can successfully generate data for the model to be trained on.
arXiv Detail & Related papers (2021-12-07T03:39:31Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments [9.023847175654602]
Generative Adversarial Network (GAN) is an effective method to produce samples of large-scale data distribution.
GANs provide an appropriate way to learn deep representations without widespread use of labeled training data.
In GANs, the generative model is estimated via a competitive process where the generator and discriminator networks are trained simultaneously.
arXiv Detail & Related papers (2020-05-27T05:56:53Z)