Malware Analysis with Artificial Intelligence and a Particular Attention
on Results Interpretability
- URL: http://arxiv.org/abs/2107.11100v1
- Date: Fri, 23 Jul 2021 09:40:05 GMT
- Title: Malware Analysis with Artificial Intelligence and a Particular Attention
on Results Interpretability
- Authors: Benjamin Marais, Tony Quertier, Christophe Chesneau
- Abstract summary: We propose a model based on the transformation of binary files into grayscale image, which achieves an accuracy rate of 88%.
It can determine if a sample is packed or encrypted with a precision of 85%.
This kind of tool should be very useful for data analysts, it compensates for the lack of interpretability of the common detection models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Malware detection and analysis are active research subjects in cybersecurity
over the last years. Indeed, the development of obfuscation techniques, as
packing, for example, requires special attention to detect recent variants of
malware. The usual detection methods do not necessarily provide tools to
interpret the results. Therefore, we propose a model based on the
transformation of binary files into grayscale image, which achieves an accuracy
rate of 88%. Furthermore, the proposed model can determine if a sample is
packed or encrypted with a precision of 85%. It allows us to analyze results
and act appropriately. Also, by applying attention mechanisms on detection
models, we have the possibility to identify which part of the files looks
suspicious. This kind of tool should be very useful for data analysts, it
compensates for the lack of interpretability of the common detection models,
and it can help to understand why some malicious files are undetected.
Related papers
- Semantic Preprocessing for LLM-based Malware Analysis [0.0]
We propose a new preprocessing method which creates reports for Portable Executable files.<n>The purpose of this preprocessing is to gather a semantic representation of binary files, understandable by malware analysts.<n>Using this preprocessing to train a Large Language Model for Malware classification, we achieve a weighted-average F1-score of 0.94 on a complex dataset.
arXiv Detail & Related papers (2025-06-13T13:39:00Z) - Enhanced Consistency Bi-directional GAN(CBiGAN) for Malware Anomaly Detection [0.25163931116642785]
This paper introduces the application of the CBiGAN in the domain of malware anomaly detection.<n>We utilize several datasets including both portable executable (PE) files as well as Object Linking and Embedding (OLE) files.<n>We then evaluated our model against a diverse set of both PE and OLE files, including self-collected malicious executables from 214 malware families.
arXiv Detail & Related papers (2025-06-09T02:43:25Z) - DWFS-Obfuscation: Dynamic Weighted Feature Selection for Robust Malware Familial Classification under Obfuscation [1.4742348415878401]
We propose a dynamic weighted feature selection method that analyzes the importance and stability of features.
We then utilize graph neural networks for classification, thereby improving the robustness and accuracy of the detection system.
Experiments demonstrate that our proposed method achieves an F1-score of 95.56% on the unobfuscated dataset and 92.28% on the obfuscated dataset.
arXiv Detail & Related papers (2025-04-10T09:37:43Z) - The Road Less Traveled: Investigating Robustness and Explainability in CNN Malware Detection [15.43868945929965]
We integrate quantitative analysis with explainability tools to better understand CNN behavior in malware classification.
obfuscation techniques can reduce model accuracy by up to 50%, and propose a mitigation strategy to enhance robustness.
This work contributes to improving the interpretability and resilience of deep learning-based intrusion detection systems.
arXiv Detail & Related papers (2025-03-03T10:42:00Z) - Unveiling Malware Patterns: A Self-analysis Perspective [15.517313565392852]
VisUnpack is a static analysis-based data visualization framework for bolstering attack prevention and aiding recovery post-attack.
Our method includes unpacking packed malware programs, calculating local similarity descriptors based on basic blocks, enhancing correlations between descriptors, and refining them by minimizing noises.
Our comprehensive evaluation of VisUnpack based on a freshly gathered dataset with over 27,106 samples confirms its capability in accurately classifying malware programs with a precision of 99.7%.
arXiv Detail & Related papers (2025-01-10T16:04:13Z) - MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z) - Detecting new obfuscated malware variants: A lightweight and interpretable machine learning approach [0.0]
We present a machine learning-based system for detecting obfuscated malware that is highly accurate, lightweight and interpretable.
Our system is capable of detecting 15 malware subtypes despite being exclusively trained on one malware subtype, namely the Transponder from the Spyware family.
The Transponder-focused model exhibited high accuracy, exceeding 99.8%, with an average processing speed of 5.7 microseconds per file.
arXiv Detail & Related papers (2024-07-07T12:41:40Z) - Bayesian Learned Models Can Detect Adversarial Malware For Free [28.498994871579985]
Adversarial training is an effective method but is computationally expensive to scale up to large datasets.
In particular, a Bayesian formulation can capture the model parameters' distribution and quantify uncertainty without sacrificing model performance.
We found, quantifying uncertainty through Bayesian learning methods can defend against adversarial malware.
arXiv Detail & Related papers (2024-03-27T07:16:48Z) - Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Burning the Adversarial Bridges: Robust Windows Malware Detection
Against Binary-level Mutations [16.267773730329207]
We conduct root-cause analyses of the practical binary-level black-box adversarial malware examples.
We highlight volatile information channels within the software and introduce three software pre-processing steps to eliminate the attack surface.
To counter the emerging section injection attacks, we propose a graph-based section-dependent information extraction scheme.
arXiv Detail & Related papers (2023-10-05T03:28:02Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework
of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Mate! Are You Really Aware? An Explainability-Guided Testing Framework
for Robustness of Malware Detectors [49.34155921877441]
We propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors.
We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware.
Our findings shed light on the limitations of current malware detectors, as well as how they can be improved.
arXiv Detail & Related papers (2021-11-19T08:02:38Z) - Malware Detection Using Frequency Domain-Based Image Visualization and
Deep Learning [16.224649756613655]
We propose a novel method to detect and visualize malware through image classification.
The executable binaries are represented as grayscale images obtained from the count of N-grams (N=2) of bytes in the Discrete Cosine Transform domain.
A shallow neural network is trained for classification, and its accuracy is compared with deep-network architectures such as ResNet that are trained using transfer learning.
arXiv Detail & Related papers (2021-01-26T06:07:46Z) - Being Single Has Benefits. Instance Poisoning to Deceive Malware
Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier.
As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger.
We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z) - MDEA: Malware Detection with Evolutionary Adversarial Learning [16.8615211682877]
MDEA, an Adversarial Malware Detection model uses evolutionary optimization to create attack samples to make the network robust against evasion attacks.
By retraining the model with the evolved malware samples, its performance improves a significant margin.
arXiv Detail & Related papers (2020-02-09T09:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.