Scalable APT Malware Classification via Parallel Feature Extraction and GPU-Accelerated Learning
- URL: http://arxiv.org/abs/2504.15497v1
- Date: Tue, 22 Apr 2025 00:05:05 GMT
- Title: Scalable APT Malware Classification via Parallel Feature Extraction and GPU-Accelerated Learning
- Authors: Noah Subedar, Taeui Kim, Saathwick Venkataramalingam
- Abstract summary: This paper presents a framework for mapping malicious executables to known Advanced Persistent Threat (APT) groups. The main feature of this analysis is the assembly-level instructions present in executables, also known as opcodes. Traditional machine learning and deep learning models are trained to classify malware samples.
- Score: 0.3277163122167433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an underlying framework for both automating and accelerating malware classification, more specifically, mapping malicious executables to known Advanced Persistent Threat (APT) groups. The main feature of this analysis is the assembly-level instructions present in executables, also known as opcodes. Collecting such opcodes from many malicious samples is a lengthy process; hence, open-source reverse engineering tools are used in tandem with scripts that leverage parallel computing to analyze multiple files at once. Traditional machine learning and deep learning models are then trained to classify malware samples. One-gram and two-gram datasets are constructed and used to train models such as SVM, KNN, and Decision Tree; however, these models struggle to provide adequate results without relying on metadata to supplement the n-gram sequences. Their computational limitations are overcome with convolutional neural networks (CNNs), whose training is heavily accelerated using graphics processing unit (GPU) resources.
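As a rough illustration of the parallel opcode-collection step the abstract describes (the paper's own scripts are not reproduced here, so the choice of objdump as the open-source disassembler, the file layout, and all names below are assumptions), one might fan disassembly out across worker processes like this:

```python
# Hypothetical sketch of the parallel opcode-extraction step: disassemble many
# samples at once and keep only the opcode mnemonics. Assumes GNU objdump is
# installed; paths and names are illustrative, not the paper's actual tooling.
import subprocess
from multiprocessing import Pool
from pathlib import Path

def extract_opcodes(sample: Path) -> tuple[str, list[str]]:
    """Disassemble one executable and return its opcode mnemonic sequence."""
    out = subprocess.run(
        ["objdump", "-d", str(sample)],
        capture_output=True, text=True, check=False,
    ).stdout
    opcodes = []
    for line in out.splitlines():
        # objdump instruction lines look like:
        #   401000:  55                  push   %rbp
        parts = line.split("\t")
        if len(parts) == 3:                  # addr, raw bytes, "mnemonic operands"
            opcodes.append(parts[2].split()[0])
    return sample.name, opcodes

if __name__ == "__main__":
    samples = sorted(Path("samples").glob("*.exe"))   # illustrative directory
    with Pool() as pool:                              # one worker per CPU core
        corpus = dict(pool.map(extract_opcodes, samples))
    print({name: ops[:5] for name, ops in corpus.items()})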
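Likewise, a minimal sketch of how the one-gram and two-gram datasets could feed the SVM, KNN, and Decision Tree models named in the abstract. scikit-learn and every hyperparameter here are assumptions (the paper does not state its toolkit), and `corpus`/`apt_label` are the hypothetical opcode corpus and APT-group labels from the previous sketch:

```python
# Hedged sketch: build one-gram/two-gram opcode count features and train the
# classical models named in the abstract. scikit-learn and all settings are
# assumptions; `corpus` and `apt_label` are hypothetical inputs.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

docs = [" ".join(ops) for ops in corpus.values()]     # opcode list -> "text"
labels = [apt_label[name] for name in corpus]         # APT group per sample

# ngram_range=(1, 2) yields both the one-gram and two-gram opcode features.
X = CountVectorizer(ngram_range=(1, 2), token_pattern=r"\S+").fit_transform(docs)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)

for model in (SVC(), KNeighborsClassifier(), DecisionTreeClassifier()):
    score = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, f"accuracy = {score:.3f}")
```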
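Finally, a hedged sketch of the GPU-accelerated CNN stage. The abstract does not specify the architecture, so the 1-D convolutional model below (vocabulary size, layer widths, sequence length, number of APT classes) is purely illustrative, with PyTorch assumed as the framework:

```python
# Hedged sketch of a GPU-accelerated CNN over integer-encoded opcode sequences.
# All sizes are invented; only the CNN-on-GPU idea comes from the abstract.
import torch
import torch.nn as nn

class OpcodeCNN(nn.Module):
    def __init__(self, vocab_size: int = 1000, n_classes: int = 12):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 64)
        self.conv = nn.Sequential(
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),          # max-pool over the sequence axis
        )
        self.fc = nn.Linear(128, n_classes)   # one logit per APT group

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.embed(x).transpose(1, 2)     # (batch, channels, seq_len)
        return self.fc(self.conv(h).squeeze(-1))

device = "cuda" if torch.cuda.is_available() else "cpu"   # GPU acceleration
model = OpcodeCNN().to(device)
batch = torch.randint(0, 1000, (32, 512), device=device)  # fake opcode ids
targets = torch.randint(0, 12, (32,), device=device)      # fake APT labels
loss = nn.CrossEntropyLoss()(model(batch), targets)
loss.backward()
print(f"ran on {device}, loss = {loss.item():.3f}")
```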
Related papers
- Reviving Any-Subset Autoregressive Models with Principled Parallel Sampling and Speculative Decoding [55.2480439325792]
In arbitrary-order language models, it is an open question how to sample tokens in parallel from the correct joint distribution.
We find that a different class of models, any-subset autoregressive models (AS-ARMs), holds the solution.
We show that AS-ARMs achieve state-of-the-art performance among sub-200M parameter models on infilling benchmark tasks, and nearly match the performance of models 50X larger on code generation.
arXiv Detail & Related papers (2025-04-29T06:33:13Z) - OpCode-Based Malware Classification Using Machine Learning and Deep Learning Techniques [0.0]
This report presents a comprehensive analysis of malware classification using OpCode sequences. Two distinct approaches are evaluated: traditional machine learning using n-gram analysis with Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Decision Tree classifiers; and a deep learning approach employing a Convolutional Neural Network (CNN).
arXiv Detail & Related papers (2025-04-18T02:09:57Z) - RGCVAE: Relational Graph Conditioned Variational Autoencoder for Molecule Design [70.59828655929194]
Deep Graph Variational Autoencoders are among the most powerful machine learning tools with which it is possible to address this problem.
We propose RGCVAE, an efficient and effective Graph Variational Autoencoder based on: (i) an encoding network exploiting a new powerful Graph Isomorphism Network; (ii) a novel probabilistic decoding component.
arXiv Detail & Related papers (2023-05-19T14:23:48Z) - SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient [69.61083127540776]
Deep learning applications benefit from using large models with billions of parameters.
Training these models is notoriously expensive due to the need for specialized HPC clusters.
We consider alternative setups for training large models: using cheap "preemptible" instances or pooling existing resources from multiple regions.
arXiv Detail & Related papers (2023-01-27T18:55:19Z) - Quo Vadis: Hybrid Machine Learning Meta-Model based on Contextual and Behavioral Malware Representations [5.439020425819001]
We propose a hybrid machine learning architecture that simultaneously employs multiple deep learning models.
We report an improved detection rate, above the capabilities of the current state-of-the-art model.
arXiv Detail & Related papers (2022-08-20T05:30:16Z) - Benchmark Assessment for DeepSpeed Optimization Library [1.7839986996686321]
Deep Learning (DL) models are widely used in machine learning due to their performance and ability to deal with large datasets.
The size of such datasets and the complexity of DL models make these models costly to train, consuming large amounts of resources and time.
Many recent libraries and applications have been introduced to deal with DL complexity and efficiency issues.
arXiv Detail & Related papers (2022-02-12T04:52:28Z) - Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits a graph structure once parsed, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - CREPO: An Open Repository to Benchmark Credal Network Algorithms [78.79752265884109]
Credal networks are imprecise probabilistic graphical models based on so-called credal sets of probability mass functions.
A Java library called CREMA has been recently released to model, process and query credal networks.
We present CREPO, an open repository of synthetic credal networks, provided together with the exact results of inference tasks on these models.
arXiv Detail & Related papers (2021-05-10T07:31:59Z) - CNN vs ELM for Image-Based Malware Classification [3.4806267677524896]
We train and evaluate machine learning models for malware classification, based on features that can be obtained without disassembly or execution of code.
We find that ELMs can achieve accuracies on par with CNNs, yet ELM training requires less than 2% of the time needed to train a comparable CNN.
arXiv Detail & Related papers (2021-03-24T00:51:06Z) - MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation [153.56211546576978]
In this work, we propose that better soft targets with higher compatibility can be generated by using a label generator.
We can employ the meta-learning technique to optimize this label generator.
The experiments are conducted on two standard classification benchmarks, namely CIFAR-100 and ILSVRC2012.
arXiv Detail & Related papers (2020-08-27T13:04:27Z) - Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective [3.718476964451589]
Real-world knowledge graphs are usually incomplete, so knowledge graph embedding methods have been proposed to address this issue.
These methods represent entities and relations as embedding vectors in semantic space and predict the links between them.
We propose a new multi-embedding model based on quaternion algebra and show that it achieves promising results using popular benchmarks.
arXiv Detail & Related papers (2019-03-27T13:09:16Z)