Classifying Malware Images with Convolutional Neural Network Models
- URL: http://arxiv.org/abs/2010.16108v1
- Date: Fri, 30 Oct 2020 07:39:30 GMT
- Title: Classifying Malware Images with Convolutional Neural Network Models
- Authors: Ahmed Bensaoud, Nawaf Abudawaood, Jugal Kalita
- Abstract summary: In this paper, we use several convolutional neural network (CNN) models for static malware classification.
The Inception V3 model achieves a test accuracy of 99.24%, which is better than the accuracy of 98.52% achieved by the current state-of-the-art system.
- Score: 2.363388546004777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to increasing threats from malicious software (malware) in both number
and complexity, researchers have developed approaches for the automatic detection
and classification of malware, instead of manually analyzing malware files in a
time-consuming effort. At the same time, malware authors have developed
techniques to evade the signature-based detection used by antivirus companies.
Most recently, deep learning has been applied to malware classification to
address this issue. In this paper, we use several convolutional
neural network (CNN) models for static malware classification. In particular,
we use six deep learning models, three of which are past winners of the
ImageNet Large-Scale Visual Recognition Challenge. The other three models are
CNN-SVM, GRU-SVM and MLP-SVM, which enhance neural models with support vector
machines (SVM). We perform experiments using the Malimg dataset, which has
malware images that were converted from Portable Executable malware binaries.
The dataset is divided into 25 malware families. Comparisons show that the
Inception V3 model achieves a test accuracy of 99.24%, which is better than the
accuracy of 98.52% achieved by the current state-of-the-art system called the
M-CNN model.
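The Malimg-style conversion of Portable Executable binaries into grayscale images can be sketched as follows. This is an illustrative assumption of the pipeline (each byte becomes one pixel, reshaped to a fixed width), not the paper's actual code; the function name, the fixed-width parameter, and the synthetic input are all hypothetical.

```python
# Hedged sketch: turning a binary's raw bytes into a 2-D grayscale image,
# in the spirit of the Malimg dataset. The fixed `width` is an assumption;
# the original dataset chose the width based on file size.
import numpy as np

def bytes_to_image(data: bytes, width: int = 256) -> np.ndarray:
    """Interpret each byte as one grayscale pixel and reshape to 2-D."""
    buf = np.frombuffer(data, dtype=np.uint8)
    height = int(np.ceil(len(buf) / width))
    padded = np.zeros(height * width, dtype=np.uint8)  # zero-pad the tail
    padded[: len(buf)] = buf
    return padded.reshape(height, width)

# Example: a synthetic 1024-byte "binary"
img = bytes_to_image(bytes(range(256)) * 4, width=64)
print(img.shape)  # (16, 64)
```

The resulting array can then be resized and fed to any image classifier such as Inception V3.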
Related papers
- Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restricted to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z)
- EMBERSim: A Large-Scale Databank for Boosting Similarity Search in Malware Analysis [48.5877840394508]
In recent years there has been a shift from quantification-based malware detection towards machine learning.
We propose to address the deficiencies in the space of similarity research on binary files, starting from EMBER.
We enhance EMBER with similarity information as well as malware class tags, to enable further research in the similarity space.
arXiv Detail & Related papers (2023-10-03T06:58:45Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
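The ablation-and-vote idea behind de-randomized smoothing can be illustrated with a toy sketch. This is an assumption-laden simplification, not DRSM itself: the real defense derives a robustness certificate from vote margins, which this toy omits, and the stub classifier here is purely hypothetical.

```python
# Toy sketch of window ablation + majority voting, loosely inspired by
# de-randomized smoothing. All names and the stub classifier are illustrative.
from collections import Counter

def ablate_windows(data: bytes, window: int):
    """Yield copies of `data` in which all bytes outside one window are zeroed."""
    for start in range(0, len(data), window):
        keep = data[start:start + window]
        yield b"\x00" * start + keep + b"\x00" * (len(data) - start - len(keep))

def smoothed_predict(data: bytes, classify, window: int = 4) -> int:
    """Classify every ablated view and return the majority label."""
    votes = Counter(classify(view) for view in ablate_windows(data, window))
    return votes.most_common(1)[0][0]

# Stub base classifier: returns 1 if any 0xFF byte survives the ablation
classify = lambda b: int(0xFF in b)
# A 0xFF byte confined to a single window is outvoted by the other windows,
# showing how ablation provably limits the influence of localized bytes.
pred = smoothed_predict(b"\x01\x02\xff\x03" + b"\x00" * 12, classify)
print(pred)  # 0
```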
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
- Self-Supervised Vision Transformers for Malware Detection [0.0]
This paper presents SHERLOCK, a self-supervision based deep learning model to detect malware based on the Vision Transformer (ViT) architecture.
Our proposed model also outperforms state-of-the-art techniques for multi-class classification of malware types and families, with macro-F1 scores of .497 and .491, respectively.
arXiv Detail & Related papers (2022-08-15T07:49:58Z)
- Task-Aware Meta Learning-based Siamese Neural Network for Classifying Obfuscated Malware [5.293553970082943]
Existing malware detection methods fail to correctly classify different malware families when obfuscated malware samples are present in the training dataset.
We propose a novel task-aware few-shot-learning-based Siamese Neural Network that is resilient against such control flow obfuscation techniques.
Our proposed approach is highly effective in recognizing unique malware signatures, thus correctly classifying malware samples that belong to the same malware family.
arXiv Detail & Related papers (2021-10-26T04:44:13Z)
- Malware Classification Using Long Short-Term Memory Models [6.961253535504979]
We create four different long short-term memory (LSTM)-based models and train each to classify malware samples from 20 families.
We employ techniques used in natural language processing (NLP), including word embeddings and bidirectional LSTMs.
We find that a model consisting of word embedding, biLSTMs, and CNN layers performs best in our malware classification experiments.
arXiv Detail & Related papers (2021-03-03T23:14:03Z)
- Adversarially robust deepfake media detection using fused convolutional neural network predictions [79.00202519223662]
Current deepfake detection systems struggle against unseen data.
We employ three different deep Convolutional Neural Network (CNN) models to classify fake and real images extracted from videos.
The proposed technique outperforms state-of-the-art models with 96.5% accuracy.
arXiv Detail & Related papers (2021-02-11T11:28:00Z)
- Malware Detection Using Frequency Domain-Based Image Visualization and Deep Learning [16.224649756613655]
We propose a novel method to detect and visualize malware through image classification.
The executable binaries are represented as grayscale images obtained from the count of N-grams (N=2) of bytes in the Discrete Cosine Transform domain.
A shallow neural network is trained for classification, and its accuracy is compared with deep-network architectures such as ResNet that are trained using transfer learning.
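The frequency-domain image construction described above (counts of byte 2-grams transformed by a Discrete Cosine Transform) can be sketched as follows. This is a hedged reconstruction of the general idea, not the paper's implementation: the function name, the log-scaling, and the 8-bit normalization are assumptions, and the DCT-II is built by hand in NumPy rather than with a library routine.

```python
# Hedged sketch: 256x256 byte-bigram count matrix, then a 2-D DCT-II,
# rendered as an 8-bit grayscale image. Normalization choices are assumptions.
import numpy as np

def bigram_dct_image(data: bytes) -> np.ndarray:
    buf = np.frombuffer(data, dtype=np.uint8)
    # Count occurrences of each consecutive byte pair (2-gram)
    counts = np.zeros((256, 256), dtype=np.float64)
    np.add.at(counts, (buf[:-1], buf[1:]), 1)
    # Orthonormal DCT-II basis matrix (rows = frequencies, cols = samples)
    n = 256
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    coef = C @ counts @ C.T  # separable 2-D DCT
    # Log-scale the magnitudes and normalize to 8-bit grayscale
    mag = np.log1p(np.abs(coef))
    return (255 * mag / mag.max()).astype(np.uint8)

img = bigram_dct_image(bytes(range(256)) * 4)
print(img.shape)  # (256, 256)
```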
arXiv Detail & Related papers (2021-01-26T06:07:46Z)
- Being Single Has Benefits. Instance Poisoning to Deceive Malware Classifiers [47.828297621738265]
We show how an attacker can launch a sophisticated and efficient poisoning attack targeting the dataset used to train a malware classifier.
As opposed to other poisoning attacks in the malware detection domain, our attack does not focus on malware families but rather on specific malware instances that contain an implanted trigger.
We propose a comprehensive detection approach that could serve as a future sophisticated defense against this newly discovered severe threat.
arXiv Detail & Related papers (2020-10-30T15:27:44Z)
- Data Augmentation Based Malware Detection using Convolutional Neural Networks [0.0]
Cyber-attacks have become widespread due to the increase of malware in the cyber world.
The most important feature of this type of malware is that it changes shape as it propagates from one computer to another.
This paper aims at providing image-augmentation-enhanced deep convolutional neural network models for the detection of malware families in a metamorphic malware environment.
arXiv Detail & Related papers (2020-10-05T08:58:07Z) - Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, which is an improvement over the current state-of-the art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.