Related papers: Classifying Malware Using Function Representations in a Static Call Graph

URL: http://arxiv.org/abs/2012.01939v1
Date: Tue, 1 Dec 2020 20:36:19 GMT
Title: Classifying Malware Using Function Representations in a Static Call Graph
Authors: Thomas Dalton, Mauritius Schmidtler, Alireza Hadj Khodabakhshi
Abstract summary: We propose a deep learning approach for identifying malware families using the function call graphs of x86 assembly instructions. We test our approach by performing several experiments on a Microsoft malware classification data set and achieve excellent separation between malware families with a classification accuracy of 99.41%.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a deep learning approach for identifying malware families using the function call graphs of x86 assembly instructions. Though prior work on static call graph analysis exists, very little involves the application of modern, principled feature learning techniques to the problem. In this paper, we introduce a system utilizing an executable's function call graph where function representations are obtained by way of a recurrent neural network (RNN) autoencoder which maps sequences of x86 instructions into dense, latent vectors. These function embeddings are then modeled as vertices in a graph with edges indicating call dependencies. Capturing rich, node-level representations as well as global, topological properties of an executable file greatly improves malware family detection rates and contributes to a more principled approach to the problem in a way that deliberately avoids tedious feature engineering and domain expertise. We test our approach by performing several experiments on a Microsoft malware classification data set and achieve excellent separation between malware families with a classification accuracy of 99.41%.
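The pipeline described in the abstract (per-function vectors placed on the vertices of a call graph, then summarized for the whole executable) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the names `OPCODES`, `embed_function`, `CallGraph`, and `build_example` are hypothetical, and the embedding is a plain normalized opcode count standing in for the paper's RNN autoencoder, which is not reproduced. The neighbor averaging and mean pooling are likewise generic stand-ins, not the authors' graph model.

```python
from dataclasses import dataclass, field

# Hypothetical toy opcode vocabulary (an assumption, not from the paper).
OPCODES = ["mov", "push", "pop", "call", "ret", "xor"]


def embed_function(instructions):
    """Stand-in embedding: normalized opcode counts.

    The paper learns this vector with an RNN autoencoder; this is only
    a placeholder so the graph construction can be shown end to end.
    """
    counts = [0.0] * len(OPCODES)
    for ins in instructions:
        opcode = ins.split()[0]
        if opcode in OPCODES:
            counts[OPCODES.index(opcode)] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


@dataclass
class CallGraph:
    vectors: dict = field(default_factory=dict)  # function name -> embedding
    edges: dict = field(default_factory=dict)    # caller -> set of callees

    def add_function(self, name, instructions):
        self.vectors[name] = embed_function(instructions)
        self.edges.setdefault(name, set())

    def add_call(self, caller, callee):
        self.edges.setdefault(caller, set()).add(callee)

    def aggregate(self):
        """One round of averaging each vertex with its known callees."""
        out = {}
        for name, vec in self.vectors.items():
            callees = [c for c in self.edges.get(name, ()) if c in self.vectors]
            group = [vec] + [self.vectors[c] for c in callees]
            out[name] = [sum(col) / len(group) for col in zip(*group)]
        return out

    def graph_vector(self):
        """Mean-pool vertex vectors into one vector for the executable."""
        rows = list(self.aggregate().values())
        return [sum(col) / len(rows) for col in zip(*rows)]


def build_example():
    g = CallGraph()
    g.add_function("main", ["push ebp", "mov ebp, esp", "call helper", "ret"])
    g.add_function("helper", ["xor eax, eax", "ret"])
    g.add_call("main", "helper")
    return g
```

Calling `build_example().graph_vector()` yields one fixed-length vector per executable, which is the kind of input a downstream malware-family classifier would consume.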

Related papers

Know Your Neighborhood: General and Zero-Shot Capable Binary Function Search Powered by Call Graphlets [0.7646713951724013]
This paper proposes a novel graph neural network architecture combined with a novel graph data representation called call graphlets. A specialized graph neural network model operates on this graph representation, learning to map it to a feature vector that encodes semantic binary code similarities. Experimental results show that the combination of call graphlets and the novel graph neural network architecture achieves comparable or state-of-the-art performance.
arXiv Detail & Related papers (2024-06-02T18:26:50Z)
Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions. By finding a mean in this embedding space, we can recover a mean graph that preserves structural information. We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z)
GIF: A General Graph Unlearning Strategy via Influence Function [63.52038638220563]
Graph Influence Function (GIF) is a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data. We conduct extensive experiments on four representative GNN models and three benchmark datasets to justify GIF's superiority in terms of unlearning efficacy, model utility, and unlearning efficiency.
arXiv Detail & Related papers (2023-04-06T03:02:54Z)
A Comparison of Graph Neural Networks for Malware Classification [2.707154152696381]
We train a wide range of Graph Neural Network (GNN) architectures to generate embeddings which we then classify. We find that our best GNN models outperform previous comparable research involving the well-known MalNet-Tiny Android malware dataset.
arXiv Detail & Related papers (2023-03-22T01:05:57Z)
State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more. Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures. Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z)
Learning Heuristics for the Maximum Clique Enumeration Problem Using Low Dimensional Representations [0.0]
We use a learning framework for pruning the input graph in order to reduce the runtime of the maximum clique enumeration problem. We study how different vertex representations affect the performance of this heuristic method. We observe that using local graph features in the classification process produces more accurate results when combined with a feature elimination process.
arXiv Detail & Related papers (2022-10-30T22:04:32Z)
Malware Analysis with Symbolic Execution and Graph Kernel [2.1377923666134113]
We propose a new efficient open source toolchain for machine learning-based classification. We focus on the 1-dimensional Weisfeiler-Lehman kernel, which can capture local similarities between graphs.
arXiv Detail & Related papers (2022-04-12T08:52:33Z)
Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching. The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k). It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network. For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings. We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
Time-varying Graph Representation Learning via Higher-Order Skip-Gram with Negative Sampling [0.456877715768796]
We build upon the fact that the skip-gram embedding approach implicitly performs a matrix factorization. We show that higher-order skip-gram with negative sampling is able to disentangle the role of nodes and time. We empirically evaluate our approach using time-resolved face-to-face proximity data, showing that the learned time-varying graph representations outperform state-of-the-art methods.
arXiv Detail & Related papers (2020-06-25T12:04:48Z)
Structural Temporal Graph Neural Networks for Anomaly Detection in Dynamic Graphs [54.13919050090926]
We propose an end-to-end structural temporal Graph Neural Network model for detecting anomalous edges in dynamic graphs. In particular, we first extract the $h$-hop enclosing subgraph centered on the target edge and propose a node labeling function to identify the role of each node in the subgraph. Based on the extracted features, we utilize gated recurrent units (GRUs) to capture the temporal information for anomaly detection.
arXiv Detail & Related papers (2020-05-15T09:17:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.