ProvG-Searcher: A Graph Representation Learning Approach for Efficient Provenance Graph Search
- URL: http://arxiv.org/abs/2309.03647v2
- Date: Tue, 19 Dec 2023 12:24:31 GMT
- Title: ProvG-Searcher: A Graph Representation Learning Approach for Efficient Provenance Graph Search
- Authors: Enes Altinisik, Fatih Deniz, Husrev Taha Sencar,
- Abstract summary: We present ProvG-Searcher, a novel approach for detecting known APT behaviors within system security logs.
We formulate the task of searching provenance graphs as a subgraph matching problem and employ a graph representation learning method.
Experimental results on standard datasets demonstrate that ProvG-Searcher achieves superior performance, with an accuracy exceeding 99% in detecting query behaviors.
- Score: 13.627536649679577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ProvG-Searcher, a novel approach for detecting known APT behaviors within system security logs. Our approach leverages provenance graphs, a comprehensive graph representation of event logs, to capture and depict data provenance relations by mapping system entities as nodes and their interactions as edges. We formulate the task of searching provenance graphs as a subgraph matching problem and employ a graph representation learning method. The central component of our search methodology involves embedding of subgraphs in a vector space where subgraph relationships can be directly evaluated. We achieve this through the use of order embeddings that simplify subgraph matching to straightforward comparisons between a query and precomputed subgraph representations. To address challenges posed by the size and complexity of provenance graphs, we propose a graph partitioning scheme and a behavior-preserving graph reduction method. Overall, our technique offers significant computational efficiency, allowing most of the search computation to be performed offline while incorporating a lightweight comparison step during query execution. Experimental results on standard datasets demonstrate that ProvG-Searcher achieves superior performance, with an accuracy exceeding 99% in detecting query behaviors and a false positive rate of approximately 0.02%, outperforming other approaches.
Related papers
- The Graph Lottery Ticket Hypothesis: Finding Sparse, Informative Graph
Structure [18.00833762891405]
Graph Lottery Ticket (GLT) Hypothesis: There is an extremely sparse backbone for every graph.
We study 8 key metrics of interest that directly influence the performance of graph learning algorithms.
We propose a straightforward and efficient algorithm for finding these GLTs in arbitrary graphs.
arXiv Detail & Related papers (2023-12-08T00:24:44Z) - A Simple and Scalable Graph Neural Network for Large Directed Graphs [11.792826520370774]
We investigate various combinations of node representations and edge direction awareness within an input graph.
In response, we propose a simple yet holistic classification method A2DUG.
We demonstrate that A2DUG stably performs well on various datasets and improves the accuracy up to 11.29 compared with the state-of-the-art methods.
arXiv Detail & Related papers (2023-06-14T06:24:58Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - Subgraph Matching via Query-Conditioned Subgraph Matching Neural
Networks and Bi-Level Tree Search [33.9052190473029]
Subgraph Matching is a core operation in graph database search, biomedical analysis, social group finding, etc.
In this paper, we propose a novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs.
Experiments on five large real-world target graphs show that N-BLS can significantly improve the subgraph matching performance.
arXiv Detail & Related papers (2022-07-21T04:47:21Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Interactive Visual Pattern Search on Graph Data via Graph Representation
Learning [20.795511688640296]
We propose a visual analytics system GraphQ to support human-in-the-loop, example-based, subgraph pattern search.
To support fast, interactive queries, we use graph neural networks (GNNs) to encode a graph as fixed-length latent vector representation.
We also propose a novel GNN for node-alignment called NeuroAlign to facilitate easy validation and interpretation of the query results.
arXiv Detail & Related papers (2022-02-18T22:30:28Z) - Reinforcement Learning Based Query Vertex Ordering Model for Subgraph
Matching [58.39970828272366]
Subgraph matching algorithms enumerate all is embeddings of a query graph in a data graph G.
matching order plays a critical role in time efficiency of these backtracking based subgraph matching algorithms.
In this paper, for the first time we apply the Reinforcement Learning (RL) and Graph Neural Networks (GNNs) techniques to generate the high-quality matching order for subgraph matching algorithms.
arXiv Detail & Related papers (2022-01-25T00:10:03Z) - Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint emphgraph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.
The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k)
It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z) - Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose a Semantic Graph Convolutional Networks (SGCN) that explores the implicit semantics by learning latent semantic-paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z) - Graph Pooling with Node Proximity for Hierarchical Representation
Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
arXiv Detail & Related papers (2020-06-19T13:09:44Z) - Distance Metric Learning for Graph Structured Data [23.3700989820887]
We propose a supervised distance metric learning method for the graph classification problem.
Our method, named interpretable graph metric learning (IGML), learns discriminative metrics in a subgraph-based feature space.
We empirically evaluate the computational efficiency and classification performance of IGML on several benchmark datasets.
arXiv Detail & Related papers (2020-02-03T13:42:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.