Marlin: Knowledge-Driven Analysis of Provenance Graphs for Efficient and Robust Detection of Cyber Attacks
- URL: http://arxiv.org/abs/2403.12541v2
- Date: Wed, 10 Jul 2024 05:49:48 GMT
- Title: Marlin: Knowledge-Driven Analysis of Provenance Graphs for Efficient and Robust Detection of Cyber Attacks
- Authors: Zhenyuan Li, Yangyang Wei, Xiangmin Shen, Lingzhi Wang, Yan Chen, Haitao Xu, Shouling Ji, Fan Zhang, Liang Hou, Wenmao Liu, Xuhong Zhang, Jianwei Ying,
- Abstract summary: We introduce Marlin, which approaches cyber attack detection through real-time provenance graph alignment.
Marlin can process 137K events per second while accurately identifying 120 subgraphs with 31 confirmed attacks, along with only 1 false positive.
- Score: 32.77246634664381
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research in both academia and industry has validated the effectiveness of provenance graph-based detection for advanced cyber attack detection and investigation. However, analyzing large-scale provenance graphs often results in substantial overhead. To improve performance, existing detection systems implement various optimization strategies. Yet, as several recent studies suggest, these strategies could lose necessary context information and be vulnerable to evasions. Designing a detection system that is efficient and robust against adversarial attacks is an open problem. We introduce Marlin, which approaches cyber attack detection through real-time provenance graph alignment.By leveraging query graphs embedded with attack knowledge, Marlin can efficiently identify entities and events within provenance graphs, embedding targeted analysis and significantly narrowing the search space. Moreover, we incorporate our graph alignment algorithm into a tag propagation-based schema to eliminate the need for storing and reprocessing raw logs. This design significantly reduces in-memory storage requirements and minimizes data processing overhead. As a result, it enables real-time graph alignment while preserving essential context information, thereby enhancing the robustness of cyber attack detection. Moreover, Marlin allows analysts to customize attack query graphs flexibly to detect extended attacks and provide interpretable detection results. We conduct experimental evaluations on two large-scale public datasets containing 257.42 GB of logs and 12 query graphs of varying sizes, covering multiple attack techniques and scenarios. The results show that Marlin can process 137K events per second while accurately identifying 120 subgraphs with 31 confirmed attacks, along with only 1 false positive, demonstrating its efficiency and accuracy in handling massive data.
Related papers
- LogSHIELD: A Graph-based Real-time Anomaly Detection Framework using Frequency Analysis [3.140349394142226]
We present LogSHIELD, a graph-based anomaly detection model in host data.
It can detect stealthy and sophisticated attacks with over 98% average AUC and F1 scores.
It significantly improves throughput, achieves an average detection latency of 0.13 seconds, and outperforms state-of-the-art models in detection time.
arXiv Detail & Related papers (2024-10-29T10:52:43Z) - GraphGuard: Detecting and Counteracting Training Data Misuse in Graph
Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z) - Effective In-vehicle Intrusion Detection via Multi-view Statistical
Graph Learning on CAN Messages [9.04771951523525]
In-vehicle network (IVN) is facing a wide variety of complex and changing external cyber-attacks.
Only coarse-grained recognition can be achieved in current mainstream intrusion detection mechanisms.
We propose StatGraph: an Effective Multi-view Statistical Graph Learning Intrusion Detection.
arXiv Detail & Related papers (2023-11-13T03:49:55Z) - GraphCloak: Safeguarding Task-specific Knowledge within Graph-structured Data from Unauthorized Exploitation [61.80017550099027]
Graph Neural Networks (GNNs) are increasingly prevalent in a variety of fields.
Growing concerns have emerged regarding the unauthorized utilization of personal data.
Recent studies have shown that imperceptible poisoning attacks are an effective method of protecting image data from such misuse.
This paper introduces GraphCloak to safeguard against the unauthorized usage of graph data.
arXiv Detail & Related papers (2023-10-11T00:50:55Z) - Kairos: Practical Intrusion Detection and Investigation using
Whole-system Provenance [4.101641763092759]
Provenance graphs are structured audit logs that describe the history of a system's execution.
We identify four common dimensions that drive the development of provenance-based intrusion detection systems (PIDSes)
We present KAIROS, the first PIDS that simultaneously satisfies the desiderata in all four dimensions.
arXiv Detail & Related papers (2023-08-09T16:04:55Z) - Model Inversion Attacks against Graph Neural Networks [65.35955643325038]
We study model inversion attacks against Graph Neural Networks (GNNs)
In this paper, we present GraphMI to infer the private training graph data.
Our experimental results show that such defenses are not sufficiently effective and call for more advanced defenses against privacy attacks.
arXiv Detail & Related papers (2022-09-16T09:13:43Z) - Deep Fraud Detection on Non-attributed Graph [61.636677596161235]
Graph Neural Networks (GNNs) have shown solid performance on fraud detection.
labeled data is scarce in large-scale industrial problems, especially for fraud detection.
We propose a novel graph pre-training strategy to leverage more unlabeled data.
arXiv Detail & Related papers (2021-10-04T03:42:09Z) - Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z) - Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs)
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.