Know Your Neighborhood: General and Zero-Shot Capable Binary Function Search Powered by Call Graphlets
- URL: http://arxiv.org/abs/2406.02606v2
- Date: Mon, 11 Nov 2024 21:40:16 GMT
- Title: Know Your Neighborhood: General and Zero-Shot Capable Binary Function Search Powered by Call Graphlets
- Authors: Joshua Collyer, Tim Watson, Iain Phillips,
- Abstract summary: This paper proposes a novel graph neural network architecture combined with a novel graph data representation called call graphlets.
A specialized graph neural network model operates on this graph representation, learning to map it to a feature vector that encodes semantic binary code similarities.
Experimental results show that the combination of call graphlets and the novel graph neural network architecture achieves comparable or state-of-the-art performance.
- Score: 0.7646713951724013
- License:
- Abstract: Binary code similarity detection is an important problem with applications in areas such as malware analysis, vulnerability research and license violation detection. This paper proposes a novel graph neural network architecture combined with a novel graph data representation called call graphlets. A call graphlet encodes the neighborhood around each function in a binary executable, capturing the local and global context through a series of statistical features. A specialized graph neural network model operates on this graph representation, learning to map it to a feature vector that encodes semantic binary code similarities using deep-metric learning. The proposed approach is evaluated across five distinct datasets covering different architectures, compiler tool chains, and optimization levels. Experimental results show that the combination of call graphlets and the novel graph neural network architecture achieves comparable or state-of-the-art performance compared to baseline techniques across cross-architecture, mono-architecture and zero shot tasks. In addition, our proposed approach also performs well when evaluated against an out-of-domain function inlining task. The work provides a general and effective graph neural network-based solution for conducting binary code similarity detection.
Related papers
- Graphcode: Learning from multiparameter persistent homology using graph neural networks [0.06138671548064355]
Graphcodes handle datasets that are filtered along two real-valued scale parameters.
Graphcodes yield an informative and interpretable summary.
They can be readily integrated in machine learning pipelines using graph neural networks.
arXiv Detail & Related papers (2024-05-23T08:22:00Z) - GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based
Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning.
We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
arXiv Detail & Related papers (2024-01-17T13:04:23Z) - Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z) - UniG-Encoder: A Universal Feature Encoder for Graph and Hypergraph Node
Classification [6.977634174845066]
A universal feature encoder for both graph and hypergraph representation learning is designed, called UniG-Encoder.
The architecture starts with a forward transformation of the topological relationships of connected nodes into edge or hyperedge features.
The encoded node embeddings are then derived from the reversed transformation, described by the transpose of the projection matrix.
arXiv Detail & Related papers (2023-08-03T09:32:50Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - BS-GAT Behavior Similarity Based Graph Attention Network for Network
Intrusion Detection [20.287285893803244]
This paper proposes a graph neural network algorithm based on behavior similarity (BS-GAT) using graph attention network.
The results show that the proposed method is effective and has superior performance comparing to existing solutions.
arXiv Detail & Related papers (2023-04-07T09:42:07Z) - CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph
Similarity Learning [65.1042892570989]
We propose a contrastive graph matching network (CGMN) for self-supervised graph similarity learning.
We employ two strategies, namely cross-view interaction and cross-graph interaction, for effective node representation learning.
We transform node representations into graph-level representations via pooling operations for graph similarity computation.
arXiv Detail & Related papers (2022-05-30T13:20:26Z) - Temporal Graph Network Embedding with Causal Anonymous Walks
Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
For evaluation, we provide a benchmark pipeline for the evaluation of temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z) - Co-embedding of Nodes and Edges with Graph Neural Networks [13.020745622327894]
Graph embedding is a way to transform and encode the data structure in high dimensional and non-Euclidean feature space.
CensNet is a general graph embedding framework, which embeds both nodes and edges to a latent feature space.
Our approach achieves or matches the state-of-the-art performance in four graph learning tasks.
arXiv Detail & Related papers (2020-10-25T22:39:31Z) - Graph Pooling with Node Proximity for Hierarchical Representation
Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
arXiv Detail & Related papers (2020-06-19T13:09:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.