Related papers: A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion

A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion

URL: http://arxiv.org/abs/2410.09401v1
Date: Sat, 12 Oct 2024 07:10:44 GMT
Title: A Novel Approach to Malicious Code Detection Using CNN-BiLSTM and Feature Fusion
Authors: Lixia Zhang, Tianxu Liu, Kaihui Shen, Cheng Chen,
Abstract summary: This study employs the minhash algorithm to convert binary files of malware into grayscale images. The study utilizes IDA Pro to decompile and extract opcode sequences, applying N-gram and tf-idf algorithms for feature vectorization. A CNN-BiLSTM fusion model is designed to simultaneously process image features and opcode sequences, enhancing classification performance.
Score: 2.3039261241391586
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid advancement of Internet technology, the threat of malware to computer systems and network security has intensified. Malware affects individual privacy and security and poses risks to critical infrastructures of enterprises and nations. The increasing quantity and complexity of malware, along with its concealment and diversity, challenge traditional detection techniques. Static detection methods struggle against variants and packed malware, while dynamic methods face high costs and risks that limit their application. Consequently, there is an urgent need for novel and efficient malware detection techniques to improve accuracy and robustness. This study first employs the minhash algorithm to convert binary files of malware into grayscale images, followed by the extraction of global and local texture features using GIST and LBP algorithms. Additionally, the study utilizes IDA Pro to decompile and extract opcode sequences, applying N-gram and tf-idf algorithms for feature vectorization. The fusion of these features enables the model to comprehensively capture the behavioral characteristics of malware. In terms of model construction, a CNN-BiLSTM fusion model is designed to simultaneously process image features and opcode sequences, enhancing classification performance. Experimental validation on multiple public datasets demonstrates that the proposed method significantly outperforms traditional detection techniques in terms of accuracy, recall, and F1 score, particularly in detecting variants and obfuscated malware with greater stability. The research presented in this paper offers new insights into the development of malware detection technologies, validating the effectiveness of feature and model fusion, and holds promising application prospects.

Related papers

CodeVision: Detecting LLM-Generated Code Using 2D Token Probability Maps and Vision Models [28.711745671275477]
The rise of large language models (LLMs) has significantly improved automated code generation, enhancing software development efficiency. Existing detection methods, such as pre-trained models and watermarking, face limitations in adaptability and computational efficiency. We propose a novel detection method using 2D token probability maps combined with vision models, preserving spatial code structures.
arXiv Detail & Related papers (2025-01-06T06:15:10Z)
Predicting Vulnerability to Malware Using Machine Learning Models: A Study on Microsoft Windows Machines [0.0]
This study addresses the need for effective malware detection strategies by leveraging Machine Learning (ML) techniques. Our research aims to develop an advanced ML model that accurately predicts malware vulnerabilities based on the specific conditions of individual machines.
arXiv Detail & Related papers (2025-01-05T10:04:58Z)
StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model [62.25424831998405]
StealthDiffusion is a framework that modifies AI-generated images into high-quality, imperceptible adversarial examples. It is effective in both white-box and black-box settings, transforming AI-generated images into high-quality adversarial forgeries.
arXiv Detail & Related papers (2024-08-11T01:22:29Z)
Deep Learning Fusion For Effective Malware Detection: Leveraging Visual Features [12.431734971186673]
We investigate the power of fusing Convolutional Neural Network models trained on different modalities of a malware executable. We are proposing a novel multimodal fusion algorithm, leveraging three different visual malware features. The proposed strategy has a detection rate of 1.00 (on a scale of 0-1) in identifying malware in the given dataset.
arXiv Detail & Related papers (2024-05-23T08:32:40Z)
Leveraging LSTM and GAN for Modern Malware Detection [0.4799822253865054]
This paper proposes the utilization of the Deep Learning Model, LSTM networks, and GAN classifiers to amplify malware detection accuracy and speed. The research outcomes come out with 98% accuracy that shows the efficiency of deep learning plays a decisive role in proactive cybersecurity defense.
arXiv Detail & Related papers (2024-05-07T14:57:24Z)
Bayesian Learned Models Can Detect Adversarial Malware For Free [28.498994871579985]
Adversarial training is an effective method but is computationally expensive to scale up to large datasets. In particular, a Bayesian formulation can capture the model parameters' distribution and quantify uncertainty without sacrificing model performance. We found, quantifying uncertainty through Bayesian learning methods can defend against adversarial malware.
arXiv Detail & Related papers (2024-03-27T07:16:48Z)
Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem [37.650342256199096]
We introduce MaleficNet 2.0, a technique to embed self-extracting, self-executing malware in neural networks. MaleficNet 2.0 injection technique is stealthy, does not degrade the performance of the model, and is robust against removal techniques. We implement a proof-of-concept self-extracting neural network malware using MaleficNet 2.0, demonstrating the practicality of the attack against a widely adopted machine learning framework.
arXiv Detail & Related papers (2024-03-06T10:27:08Z)
Comprehensive evaluation of Mal-API-2019 dataset by machine learning in malware detection [0.5475886285082937]
This study conducts a thorough examination of malware detection using machine learning techniques. The aim is to advance cybersecurity capabilities by identifying and mitigating threats more effectively.
arXiv Detail & Related papers (2024-03-04T17:22:43Z)
Discovering Malicious Signatures in Software from Structural Interactions [7.06449725392051]
We propose a novel malware detection approach that leverages deep learning, mathematical techniques, and network science. Our approach focuses on static and dynamic analysis and utilizes the Low-Level Virtual Machine (LLVM) to profile applications within a complex network. Our approach marks a substantial improvement in malware detection, providing a notably more accurate and efficient solution.
arXiv Detail & Related papers (2023-12-19T23:42:20Z)
Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications. Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
A survey on hardware-based malware detection approaches [45.24207460381396]
Hardware-based malware detection approaches leverage hardware performance counters and machine learning prowess. We meticulously analyze the approach, unraveling the most common methods, algorithms, tools, and datasets that shape its contours. The discussion extends to crafting mixed hardware and software approaches for collaborative efficacy, essential enhancements in hardware monitoring units, and a better understanding of the correlation between hardware events and malware applications.
arXiv Detail & Related papers (2023-03-22T13:00:41Z)
Mal2GCN: A Robust Malware Detection Approach Using Deep Graph Convolutional Networks With Non-Negative Weights [1.3190581566723918]
We present a black-box source code-based adversarial malware generation approach that can be used to evaluate the robustness of malware detection models against real-world adversaries. We then propose Mal2GCN, a robust malware detection model. Mal2GCN uses the representation power of graph convolutional networks combined with the non-negative weights training method to create a malware detection model with high detection accuracy.
arXiv Detail & Related papers (2021-08-27T19:42:13Z)
Evading Malware Classifiers via Monte Carlo Mutant Feature Discovery [23.294653273180472]
We show how a malicious actor trains a surrogate model to discover binary mutations that cause an instance to be misclassified. Then, mutated malware is sent to the victim model that takes the place of an antivirus API to test whether it can evade detection.
arXiv Detail & Related papers (2021-06-15T03:31:02Z)
Anomaly Detection Based on Selection and Weighting in Latent Space [73.01328671569759]
We propose a novel selection-and-weighting-based anomaly detection framework called SWAD. Experiments on both benchmark and real-world datasets have shown the effectiveness and superiority of SWAD.
arXiv Detail & Related papers (2021-03-08T10:56:38Z)
Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs. Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.