GraphWeaver: Billion-Scale Cybersecurity Incident Correlation
- URL: http://arxiv.org/abs/2406.01842v1
- Date: Mon, 3 Jun 2024 23:28:05 GMT
- Title: GraphWeaver: Billion-Scale Cybersecurity Incident Correlation
- Authors: Scott Freitas, Amir Gharib,
- Abstract summary: We introduce GraphWeaver, an industry-scale framework that shifts the traditional incident correlation process to a data-optimized, geo-distributed, graph-based approach.
GraphWeaver is integrated into the Microsoft Defender XDR product and deployed worldwide, handling billions of correlations with a 99% accuracy rate.
This integration has not only maintained high correlation accuracy but also reduced traditional correlation storage requirements by 7.4x.
- Score: 2.2572772235310934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the dynamic landscape of large enterprise cybersecurity, accurately and efficiently correlating billions of security alerts into comprehensive incidents is a substantial challenge. Traditional correlation techniques often struggle with maintenance, scaling, and adapting to emerging threats and novel sources of telemetry. We introduce GraphWeaver, an industry-scale framework that shifts the traditional incident correlation process to a data-optimized, geo-distributed, graph-based approach. GraphWeaver introduces a suite of innovations tailored to handle the complexities of correlating billions of shared evidence alerts across hundreds of thousands of enterprises. Key among these innovations are a geo-distributed database and PySpark analytics engine for large-scale data processing, a minimum spanning tree algorithm to optimize correlation storage, integration of security domain knowledge and threat intelligence, and a human-in-the-loop feedback system to continuously refine key correlation processes and parameters. GraphWeaver is integrated into the Microsoft Defender XDR product and deployed worldwide, handling billions of correlations with a 99% accuracy rate, as confirmed by customer feedback and extensive investigations by security experts. This integration has not only maintained high correlation accuracy but also reduced traditional correlation storage requirements by 7.4x. We provide an in-depth overview of the key design and operational features of GraphWeaver, setting a precedent as the first cybersecurity company to openly discuss these critical capabilities at this level of depth.
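The abstract mentions a minimum spanning tree algorithm for reducing correlation storage but gives no details. The following is only a minimal sketch of the general idea, not GraphWeaver's implementation. It assumes alerts are graph nodes, each shared-evidence correlation is a weighted edge, and a lower weight marks a preferred link. The names `minimum_spanning_forest` and `incidents`, the edge weights, and the sample alerts are all hypothetical. A spanning forest keeps every alert in the same connected group (incident) while storing at most one fewer edge than there are alerts in each group.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Set, Tuple

# Hypothetical edge format: (weight, alert_a, alert_b); lower weight is preferred.
Edge = Tuple[float, Hashable, Hashable]


def minimum_spanning_forest(edges: Iterable[Edge]) -> List[Edge]:
    """Kruskal's algorithm with union-find; returns the edges that are kept."""
    parent: Dict[Hashable, Hashable] = {}

    def find(node: Hashable) -> Hashable:
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    kept: List[Edge] = []
    for weight, a, b in sorted(edges, key=lambda edge: edge[0]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two groups, so it is needed
            parent[root_a] = root_b
            kept.append((weight, a, b))
    return kept


def incidents(edges: Iterable[Edge]) -> Set[FrozenSet[Hashable]]:
    """Group alerts into connected components (one component per incident)."""
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for _, a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen: Set[Hashable] = set()
    groups: Set[FrozenSet[Hashable]] = set()
    for start in adjacency:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.add(frozenset(group))
    return groups


# Made-up sample: three mutually correlated alerts plus one separate pair.
sample_edges: List[Edge] = [
    (1.0, "a1", "a2"),
    (2.0, "a2", "a3"),
    (3.0, "a1", "a3"),  # redundant: a1 and a3 are already linked via a2
    (1.5, "a4", "a5"),
]
kept_edges = minimum_spanning_forest(sample_edges)
```

In this sample the forest stores three of the four edges, and `incidents(kept_edges)` yields the same two groups as `incidents(sample_edges)`.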
Related papers
- VulRG: Multi-Level Explainable Vulnerability Patch Ranking for Complex Systems Using Graphs [20.407534993667607]
This work introduces a graph-based framework for vulnerability patch prioritization.
It integrates diverse data sources and metrics into a universally applicable model.
Refined risk metrics enable detailed assessments at the component, asset, and system levels.
arXiv Detail & Related papers (2025-02-16T14:21:52Z)
- RelGNN: Composite Message Passing for Relational Deep Learning [56.48834369525997]
We introduce RelGNN, a novel GNN framework specifically designed to capture the unique characteristics of relational databases.
At the core of our approach is the introduction of atomic routes, which are sequences of nodes forming high-order tripartite structures.
RelGNN consistently achieves state-of-the-art accuracy with up to 25% improvement.
arXiv Detail & Related papers (2025-02-10T18:58:40Z)
- Federated Granger Causality Learning for Interdependent Clients with State Space Representation [0.6499759302108926]
We develop a federated approach to learning Granger causality.
We propose augmenting the client models with the Granger causality information learned by the server.
We also study the convergence of the framework to a centralized oracle model.
arXiv Detail & Related papers (2025-01-23T18:04:21Z)
- PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data.
The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates.
We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
- Privacy-Preserving Intrusion Detection using Convolutional Neural Networks [0.25163931116642785]
We explore the use case of a model owner providing an analytic service on customer's private data.
No information about the data shall be revealed to the analyst and no information about the model shall be leaked to the customer.
We enhance an attack detection system based on Convolutional Neural Networks with privacy-preserving technology based on the PriMIA framework.
arXiv Detail & Related papers (2024-04-15T09:56:36Z)
- It Is Time To Steer: A Scalable Framework for Analysis-driven Attack Graph Generation [50.06412862964449]
Attack Graph (AG) represents the best-suited solution to support cyber risk assessment for multi-step attacks on computer networks.
Current solutions propose to address the generation problem from the algorithmic perspective and postulate the analysis only after the generation is complete.
This paper rethinks the classic AG analysis through a novel workflow in which the analyst can query the system anytime.
arXiv Detail & Related papers (2023-12-27T10:44:58Z)
- Fed-urlBERT: Client-side Lightweight Federated Transformers for URL Threat Analysis [6.552094912099549]
Fed-urlBERT is a federated URL pre-trained model designed to address both privacy concerns and the need for cross-domain collaboration in cybersecurity.
Our approach achieves performance comparable to a centralized model under both independently and identically distributed (IID) and two non-IID data scenarios.
arXiv Detail & Related papers (2023-12-06T17:31:16Z)
- Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society.
Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities.
With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z)
- Privacy-preserving Graph Analytics: Secure Generation and Federated Learning [72.90158604032194]
We focus on the privacy-preserving analysis of graph data, which provides the crucial capacity to represent rich attributes and relationships.
We discuss two directions, namely privacy-preserving graph generation and federated graph learning, which can jointly enable the collaboration among multiple parties each possessing private graph data.
arXiv Detail & Related papers (2022-06-30T18:26:57Z)
- Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
- PicoDomain: A Compact High-Fidelity Cybersecurity Dataset [0.9281671380673305]
Current cybersecurity datasets either offer no ground truth or do so with anonymized data.
Most existing datasets are large enough to make them unwieldy during prototype development.
In this paper we have developed the PicoDomain dataset, a compact high-fidelity collection of Zeek logs from a realistic intrusion.
arXiv Detail & Related papers (2020-08-20T20:18:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.