CRUcialG: Reconstruct Integrated Attack Scenario Graphs by Cyber Threat Intelligence Reports
- URL: http://arxiv.org/abs/2410.11209v1
- Date: Tue, 15 Oct 2024 02:50:59 GMT
- Title: CRUcialG: Reconstruct Integrated Attack Scenario Graphs by Cyber Threat Intelligence Reports
- Authors: Wenrui Cheng, Tiantian Zhu, Tieming Chen, Qixuan Yuan, Jie Ying, Hongmei Li, Chunlin Xiong, Mingda Li, Mingqi Lv, Yan Chen
- Abstract summary: We propose a system called CRUcialG for the automated reconstruction of attack scenario graphs (ASGs) from CTI reports.
First, we use NLP models to extract systematic attack knowledge from CTI reports to form preliminary ASGs.
Then, we propose a four-phase attack rationality verification framework, spanning the tactical phase and the attack procedure, to evaluate the reasonableness of the ASGs.
- Score: 9.466898583539214
- License:
- Abstract: Cyber Threat Intelligence (CTI) reports are factual records compiled by security analysts through their observations of threat events or their own practical experience with attacks. In order to utilize CTI reports for attack detection, existing methods have attempted to map the content of reports onto system-level attack provenance graphs to clearly depict attack procedures. However, existing studies on constructing graphs from CTI reports suffer from problems such as weak natural language processing (NLP) capabilities, discrete and fragmented graphs, and insufficient attack semantic representation. Therefore, we propose a system called CRUcialG for the automated reconstruction of attack scenario graphs (ASGs) from CTI reports. First, we use NLP models to extract systematic attack knowledge from CTI reports to form preliminary ASGs. Then, we propose a four-phase attack rationality verification framework, spanning the tactical phase and the attack procedure, to evaluate the reasonableness of the ASGs. Finally, we implement the relation repair and phase supplement of ASGs by adopting a serialized graph generation model. We collect a total of 10,607 CTI reports and generate 5,761 complete ASGs. Experimental results on CTI reports from 30 security vendors and DARPA show that the similarity of ASG reconstruction by CRUcialG can reach 84.54%. Compared with SOTA (EXTRACTOR and AttackG), the recall of CRUcialG (extraction of real attack events) can reach 88.13% and 94.46% respectively, which is 40% higher than SOTA on average. The F1-score of attack phase verification is able to reach 90.04%.
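The abstract describes a three-stage pipeline: NLP extraction of a preliminary ASG, a phase-level rationality check, and graph repair/supplement. The following is a minimal, purely illustrative sketch of how such a pipeline could be wired together; every function name, data structure, and phase label here is an assumption for illustration, not taken from the paper (which uses trained NLP models and a serialized graph generation model rather than the toy logic below).

```python
# Illustrative sketch only: an ASG is modeled as a list of
# (subject, action, object) triples. All names are hypothetical.

def extract_preliminary_asg(report_sentences):
    """Stage 1 stand-in: keep pre-parsed (subject, action, object)
    triples. A real system would run NER/relation-extraction models
    over raw report text instead."""
    return [tuple(s) for s in report_sentences if len(s) == 3]

# A toy ordered list of tactical phases (illustrative, not the
# paper's four phases).
TACTICAL_PHASES = ["initial_access", "execution", "persistence", "exfiltration"]

def verify_phases(asg, action_to_phase):
    """Stage 2 stand-in: check that the graph's actions cover the
    tactical phases in a plausible order, and report missing phases."""
    phases = [action_to_phase[a] for _, a, _ in asg if a in action_to_phase]
    order = [TACTICAL_PHASES.index(p) for p in phases]
    rational = order == sorted(order)  # phases appear in tactical order
    missing = set(TACTICAL_PHASES) - set(phases)
    return rational, missing

def repair_asg(asg, missing_phases, phase_defaults):
    """Stage 3 stand-in: supplement missing phases with placeholder
    events (the paper instead uses a serialized graph generation
    model to repair relations and supplement phases)."""
    return asg + [phase_defaults[p] for p in sorted(missing_phases)
                  if p in phase_defaults]
```

For example, a two-event graph covering only initial access and execution would pass the ordering check but be flagged as missing later phases, which the repair stage could then fill from defaults.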
Related papers
- MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques [7.4166591335540595]
We propose MultiKG, a fully automated framework that integrates multiple threat knowledge sources.
We implemented MultiKG and evaluated it using 1,015 real attack techniques and 9,006 attack intelligence entries from CTI reports.
Results show that MultiKG effectively extracts attack knowledge graphs from diverse sources and aggregates them into accurate, comprehensive representations.
arXiv Detail & Related papers (2024-11-13T06:15:48Z) - Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors [15.861833242429228]
We investigate data extraction attacks targeting the knowledge databases of Retrieval-Augmented Generation (RAG) systems.
To reveal the vulnerability, we propose to backdoor RAG, where a small portion of poisoned data is injected during the fine-tuning phase to create a backdoor within the LLM.
arXiv Detail & Related papers (2024-11-03T22:27:40Z) - Nip in the Bud: Forecasting and Interpreting Post-exploitation Attacks in Real-time through Cyber Threat Intelligence Reports [6.954623537148434]
Advanced Persistent Threat (APT) attacks have caused significant damage worldwide.
Various Endpoint Detection and Response (EDR) systems are deployed by enterprises to fight against potential threats.
Analysts need to investigate and filter detection results before taking countermeasures.
We propose EFI, a real-time attack forecasting and interpretation system.
arXiv Detail & Related papers (2024-05-05T06:25:52Z) - Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly to a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z) - G$^2$uardFL: Safeguarding Federated Learning Against Backdoor Attacks through Attributed Client Graph Clustering [116.4277292854053]
Federated Learning (FL) offers collaborative model training without data sharing.
FL is vulnerable to backdoor attacks, where poisoned model weights lead to compromised system integrity.
We present G$^2$uardFL, a protective framework that reinterprets the identification of malicious clients as an attributed graph clustering problem.
arXiv Detail & Related papers (2023-06-08T07:15:04Z) - Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting [58.91947205027892]
Federated learning has exhibited vulnerabilities to Byzantine attacks.
Byzantine attackers can send arbitrary gradients to a central server to destroy the convergence and performance of the global model.
A wealth of robust AGgregation Rules (AGRs) has been proposed to defend against Byzantine attacks.
arXiv Detail & Related papers (2023-02-13T03:31:50Z) - Certified Robustness Against Natural Language Attacks by Causal Intervention [61.62348826831147]
Causal Intervention by Semantic Smoothing (CISS) is a novel framework towards robustness against natural language attacks.
CISS is provably robust against word substitution attacks, as well as empirically robust even when perturbations are strengthened by unknown attack algorithms.
arXiv Detail & Related papers (2022-05-24T19:20:48Z) - On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z) - EXTRACTOR: Extracting Attack Behavior from Threat Reports [6.471387545969443]
We propose a novel approach and tool called EXTRACTOR that allows precise automatic extraction of concise attack behaviors from CTI reports.
EXTRACTOR makes no strong assumptions about the text and is capable of extracting attack behaviors as graphs from unstructured text.
Our evaluation results show that EXTRACTOR can extract concise graphs from CTI reports and can successfully be used by cyber-analytics tools in threat-hunting.
arXiv Detail & Related papers (2021-04-17T18:51:00Z) - Evaluating the Robustness of Geometry-Aware Instance-Reweighted Adversarial Training [9.351384969104771]
We evaluate the robustness of a method called "Geometry-aware Instance-reweighted Adversarial Training".
We find that a network trained with this method biases the model toward certain samples by re-scaling the loss.
arXiv Detail & Related papers (2021-03-02T18:15:42Z) - Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs).
GTA departs from prior attacks in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.