Enhanced Consistency Bi-directional GAN(CBiGAN) for Malware Anomaly Detection
- URL: http://arxiv.org/abs/2506.07372v1
- Date: Mon, 09 Jun 2025 02:43:25 GMT
- Title: Enhanced Consistency Bi-directional GAN(CBiGAN) for Malware Anomaly Detection
- Authors: Thesath Wijayasiri, Kar Wai Fok, Vrizlynn L. L. Thing
- Abstract summary: This paper introduces the application of the CBiGAN in the domain of malware anomaly detection. We utilize several datasets including both portable executable (PE) files and Object Linking and Embedding (OLE) files. We then evaluated our model against a diverse set of both PE and OLE files, including self-collected malicious executables from 214 malware families.
- Score: 0.25163931116642785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Static analysis, a cornerstone technique in cybersecurity, offers a noninvasive method for detecting malware by analyzing dormant software without executing potentially harmful code. However, traditional static analysis often relies on biased or outdated datasets, leading to gaps in detection capabilities against emerging malware threats. To address this, our study focuses on the binary content of files as key features for malware detection. These binary contents are transformed and represented as images, which then serve as inputs to deep learning models. This method takes into account the visual patterns within the binary data, allowing the model to analyze potential malware effectively. This paper introduces the application of the CBiGAN in the domain of malware anomaly detection. Our approach leverages the CBiGAN for its superior latent space mapping capabilities, critical for modeling complex malware patterns by utilizing a reconstruction error-based anomaly detection method. We utilized several datasets including both portable executable (PE) files as well as Object Linking and Embedding (OLE) files. We then evaluated our model against a diverse set of both PE and OLE files, including self-collected malicious executables from 214 malware families. Our findings demonstrate the robustness of this innovative approach, with the CBiGAN achieving high Area Under the Curve (AUC) results with good generalizability, thereby confirming its capability to distinguish between benign and diverse malicious files with reasonably high accuracy.
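The two steps named in the abstract, representing a file's binary content as an image and scoring it by reconstruction error, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed row width, the zero padding, and the use of a mean absolute pixel difference as the score are all assumptions, and the trained CBiGAN encoder and generator that would produce the reconstruction are not shown.

```python
from typing import List

Image = List[List[int]]


def bytes_to_image(data: bytes, width: int = 16) -> Image:
    """Lay raw bytes out row by row as a grayscale image (values 0-255).

    The row width and the zero padding of the last row are assumptions
    made for illustration; the paper's preprocessing may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Round the length up to the next multiple of the row width.
    padded_len = -(-len(data) // width) * width
    padded = data + bytes(padded_len - len(data))
    return [list(padded[i:i + width]) for i in range(0, padded_len, width)]


def reconstruction_error(image: Image, reconstruction: Image) -> float:
    """Mean absolute per-pixel difference between an image and its reconstruction.

    In a reconstruction-based detector, the reconstruction would come from
    a model trained on one class of files; a larger error suggests the
    input is anomalous. The exact score used in the paper is not assumed here.
    """
    total = 0
    count = 0
    for row_a, row_b in zip(image, reconstruction):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0
```

A file would be flagged when its error exceeds a threshold chosen on validation data; sweeping that threshold is what produces the AUC figures the abstract refers to.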
Related papers
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
- A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion [2.3039261241391586]
This study employs the minhash algorithm to convert binary files of malware into grayscale images.
The study utilizes IDA Pro to decompile and extract opcode sequences, applying N-gram and tf-idf algorithms for feature vectorization.
A CNN-BiLSTM fusion model is designed to simultaneously process image features and opcode sequences, enhancing classification performance.
arXiv Detail & Related papers (2024-10-12T07:10:44Z)
- Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models [65.30406788716104]
This work investigates the vulnerabilities of security-enhancing diffusion models.
We demonstrate that these models are highly susceptible to DIFF2, a simple yet effective backdoor attack.
Case studies show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models.
arXiv Detail & Related papers (2024-06-14T02:39:43Z)
- EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis [48.5877840394508]
In recent years there has been a shift from quantifications-based malware detection towards machine learning.
We propose to address the deficiencies in the space of similarity research on binary files, starting from EMBER.
We enhance EMBER with similarity information as well as malware class tags, to enable further research in the similarity space.
arXiv Detail & Related papers (2023-10-03T06:58:45Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
- Self-Supervised Vision Transformers for Malware Detection [0.0]
This paper presents SHERLOCK, a self-supervision based deep learning model to detect malware based on the Vision Transformer (ViT) architecture.
Our proposed model also outperforms state-of-the-art techniques for multi-class malware classification of types and families, with macro-F1 scores of .497 and .491, respectively.
arXiv Detail & Related papers (2022-08-15T07:49:58Z)
- Mate! Are You Really Aware? An Explainability-Guided Testing Framework for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors.
We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware.
Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
arXiv Detail & Related papers (2021-11-19T08:02:38Z)
- HAPSSA: Holistic Approach to PDF Malware Detection Using Signal and Statistical Analysis [16.224649756613655]
Malicious PDF documents present a serious threat to various security organizations.
State-of-the-art approaches use machine learning (ML) to learn features that characterize PDF malware.
In this paper, we derive a simple yet effective holistic approach to PDF malware detection.
arXiv Detail & Related papers (2021-11-08T18:32:47Z)
- Malware Analysis with Artificial Intelligence and a Particular Attention on Results Interpretability [0.0]
We propose a model based on the transformation of binary files into grayscale images, which achieves an accuracy rate of 88%.
It can determine if a sample is packed or encrypted with a precision of 85%.
This kind of tool should be very useful for data analysts, as it compensates for the lack of interpretability of common detection models.
arXiv Detail & Related papers (2021-07-23T09:40:05Z)
- Towards an Automated Pipeline for Detecting and Classifying Malware through Machine Learning [0.0]
We propose a malware taxonomic classification pipeline able to classify Windows Portable Executable (PE) files.
Given an input PE sample, it is first classified as either malicious or benign.
If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s).
arXiv Detail & Related papers (2021-06-10T10:07:50Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)