Malware Classification Based on Image Segmentation
- URL: http://arxiv.org/abs/2406.03831v1
- Date: Thu, 6 Jun 2024 08:05:20 GMT
- Title: Malware Classification Based on Image Segmentation
- Authors: Wanhu Nie
- Abstract summary: This paper proposes a novel approach for the visualization and classification of malware.
We segment the grayscale images generated from malware binary files based on the section categories.
These sub-images are then treated as multi-channel images and input into a deep convolutional neural network for malware classification.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Executable programs are highly structured files that can be recognized by operating systems and loaded into memory, analyzed for their dependencies, allocated resources, and ultimately executed. Each section of an executable program possesses distinct file and semantic boundaries, resembling puzzle pieces with varying shapes, textures, and sizes. These individualistic sections, when combined in diverse manners, constitute a complete executable program. This paper proposes a novel approach for the visualization and classification of malware. Specifically, we segment the grayscale images generated from malware binary files based on the section categories, resulting in multiple sub-images of different classes. These sub-images are then treated as multi-channel images and input into a deep convolutional neural network for malware classification. Experimental results demonstrate that images of different malware section classes exhibit favorable classification characteristics. Additionally, we discuss how the width alignment of malware grayscale images can influence the performance of the model.
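The segmentation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the section class names in `SECTION_CLASSES`, the `(class, start, end)` byte-range format, and the default `width` are all hypothetical, and the ranges are assumed to come from a separate executable parser. Each section class gets its own channel, and bytes outside that class are left as zero.

```python
import numpy as np

# Hypothetical section classes; the paper's actual category set is not listed here.
SECTION_CLASSES = ("code", "data", "resource")


def bytes_to_image(blob: bytes, width: int) -> np.ndarray:
    """Lay raw bytes out row by row as a 2-D uint8 grayscale image.

    The final row is zero-padded so the byte count is a multiple of `width`.
    """
    arr = np.frombuffer(blob, dtype=np.uint8)
    pad = (-arr.size) % width
    return np.pad(arr, (0, pad)).reshape(-1, width)


def segment_to_channels(blob: bytes, sections, width: int = 256) -> np.ndarray:
    """Split the grayscale image into one channel per section class.

    `sections` is a list of (class_name, start_offset, end_offset) tuples,
    with end_offset exclusive. These are assumed inputs, e.g. taken from
    a section table read by some executable parser.
    Returns an array of shape (num_classes, height, width).
    """
    image = bytes_to_image(blob, width)
    flat = image.reshape(-1)
    channels = np.zeros((len(SECTION_CLASSES), flat.size), dtype=np.uint8)
    for cls, start, end in sections:
        c = SECTION_CLASSES.index(cls)
        # Copy only this section's bytes into the channel for its class.
        channels[c, start:end] = flat[start:end]
    return channels.reshape(len(SECTION_CLASSES), *image.shape)


if __name__ == "__main__":
    demo = bytes(range(10))
    out = segment_to_channels(demo, [("code", 0, 4), ("data", 4, 10)], width=4)
    print(out.shape)
```

Changing `width` changes where each section's bytes fall within a row, which is one way to experiment with the width-alignment effect the abstract mentions.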
Related papers
- Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration [50.62725807357586]
This article presents a general Bayesian learning framework for multi-modal groupwise image registration.
We propose a novel hierarchical variational auto-encoding architecture to realise the inference procedure of the latent variables.
Experiments were conducted to validate the proposed framework, including four different datasets from cardiac, brain, and abdominal medical images.
arXiv Detail & Related papers (2024-01-04T08:46:39Z)
- High-resolution Image-based Malware Classification using Multiple Instance Learning [0.0]
This paper proposes a novel method of classifying malware into families using high-resolution greyscale images and multiple instance learning.
The implementation is evaluated on the Microsoft Malware Classification dataset and achieves accuracies of up to 96.6% on adversarially enlarged samples.
arXiv Detail & Related papers (2023-11-21T18:11:26Z)
- Text Descriptions are Compressive and Invariant Representations for Visual Learning [63.3464863723631]
We show that an alternative approach, in line with humans' understanding of multiple visual features per class, can provide compelling performance in the robust few-shot learning setting.
In particular, we introduce a novel method, SLR-AVD (Sparse Logistic Regression using Augmented Visual Descriptors).
This method first automatically generates multiple visual descriptions of each class via a large language model (LLM), then uses a VLM to translate these descriptions to a set of visual feature embeddings of each image, and finally uses sparse logistic regression to select a relevant subset of these features to classify
arXiv Detail & Related papers (2023-07-10T03:06:45Z)
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) for diverse medical image segmentation tasks.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- From Malware Samples to Fractal Images: A New Paradigm for Classification. (Version 2.0, Previous version paper name: Have you ever seen malware?) [0.3670422696827526]
We propose a very unconventional and novel approach to malware visualisation based on dynamic behaviour analysis.
The resulting images, which are visually very interesting, are then used to distinguish malware from goodware.
The results of the presented experiments are based on a database of 6 589 997 goodware, 827 853 potentially unwanted applications and 4 174 203 malware samples.
arXiv Detail & Related papers (2022-12-05T15:15:54Z)
- Generative Adversarial Networks and Image-Based Malware Classification [7.803471587734353]
We focus on Generative Adversarial Networks (GAN) for multiclass classification.
We find that the AC-GAN discriminator is generally competitive with other machine learning techniques.
We also evaluate the utility of the GAN generative model for adversarial attacks on image-based malware detection.
arXiv Detail & Related papers (2022-06-08T20:59:47Z)
- Virus-MNIST: Machine Learning Baseline Calculations for Image Classification [0.0]
The Virus-MNIST data set is a collection of thumbnail images that is similar in style to the ubiquitous MNIST hand-written digits.
It is poised to take on a role in benchmarking progress of virus model training.
arXiv Detail & Related papers (2021-11-03T17:44:23Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Malware Detection Using Frequency Domain-Based Image Visualization and Deep Learning [16.224649756613655]
We propose a novel method to detect and visualize malware through image classification.
The executable binaries are represented as grayscale images obtained from the count of N-grams (N=2) of bytes in the Discrete Cosine Transform domain.
A shallow neural network is trained for classification, and its accuracy is compared with deep-network architectures such as ResNet that are trained using transfer learning.
arXiv Detail & Related papers (2021-01-26T06:07:46Z)
- Retinal Image Segmentation with a Structure-Texture Demixing Network [62.69128827622726]
Complex structure and texture information are mixed in a retinal image, and separating the two is difficult.
Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance.
We propose a segmentation strategy that seeks to separate the structure and texture components and significantly improves performance.
arXiv Detail & Related papers (2020-07-15T12:19:03Z)