MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques
- URL: http://arxiv.org/abs/2411.08359v1
- Date: Wed, 13 Nov 2024 06:15:48 GMT
- Title: MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques
- Authors: Jian Wang, Tiantian Zhu, Chunlin Xiong, Yan Chen,
- Abstract summary: We propose MultiKG, a fully automated framework that integrates multiple threat knowledge sources.
We implemented MultiKG and evaluated it using 1,015 real attack techniques and 9,006 attack intelligence entries from CTI reports.
Results show that MultiKG effectively extracts attack knowledge graphs from diverse sources and aggregates them into accurate, comprehensive representations.
- Score: 7.4166591335540595
- License:
- Abstract: The construction of attack technique knowledge graphs aims to transform various types of attack knowledge into structured representations for more effective attack procedure modeling. Existing methods typically rely on textual data, such as Cyber Threat Intelligence (CTI) reports, which are often coarse-grained and unstructured, resulting in incomplete and inaccurate knowledge graphs. To address these issues, we expand attack knowledge sources by incorporating audit logs and static code analysis alongside CTI reports, providing finer-grained data for constructing attack technique knowledge graphs. We propose MultiKG, a fully automated framework that integrates multiple threat knowledge sources. MultiKG processes data from CTI reports, dynamic logs, and static code separately, then merges them into a unified attack knowledge graph. Through system design and the utilization of the Large Language Model (LLM), MultiKG automates the analysis, construction, and merging of attack graphs across these sources, producing a fine-grained, multi-source attack knowledge graph. We implemented MultiKG and evaluated it using 1,015 real attack techniques and 9,006 attack intelligence entries from CTI reports. Results show that MultiKG effectively extracts attack knowledge graphs from diverse sources and aggregates them into accurate, comprehensive representations. Through case studies, we demonstrate that our approach directly benefits security tasks such as attack reconstruction and detection.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment [38.312774244521]
We propose a knowledge graph-based verifier for Cyber Threat Intelligence (CTI) quality assessment framework.
Our approach introduces Large Language Models (LLMs) to automatically extract OSCTI key claims to be verified.
To fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources.
arXiv Detail & Related papers (2024-08-15T11:32:46Z) - Using Retriever Augmented Large Language Models for Attack Graph Generation [0.7619404259039284]
This paper explores the approach of leveraging large language models (LLMs) to automate the generation of attack graphs.
It shows how to utilize Common Vulnerabilities and Exposures (CommonLLMs) to create attack graphs from threat reports.
arXiv Detail & Related papers (2024-08-11T19:59:08Z) - AttacKG+:Boosting Attack Knowledge Graph Construction with Large Language Models [17.89951919370619]
Large Language Models (LLMs) have achieved enormous success in a broad range of tasks.
Our framework consists of four consecutive modules: rewriter, identifier, and summarizer.
We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation.
arXiv Detail & Related papers (2024-05-08T01:41:25Z) - Streamlining Attack Tree Generation: A Fragment-Based Approach [39.157069600312774]
We present a novel fragment-based attack graph generation approach that utilizes information from publicly available information security databases.
We also propose a domain-specific language for attack modeling, which we employ in the proposed attack graph generation approach.
arXiv Detail & Related papers (2023-10-01T12:41:38Z) - ThreatKG: An AI-Powered System for Automated Open-Source Cyber Threat Intelligence Gathering and Management [65.0114141380651]
ThreatKG is an automated system for OSCTI gathering and management.
It efficiently collects a large number of OSCTI reports from multiple sources.
It uses specialized AI-based techniques to extract high-quality knowledge about various threat entities.
arXiv Detail & Related papers (2022-12-20T16:13:59Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - EXTRACTOR: Extracting Attack Behavior from Threat Reports [6.471387545969443]
We propose a novel approach and tool called provenanceOR that allows precise automatic extraction of concise attack behaviors from CTI reports.
provenanceOR makes no strong assumptions about the text and is capable of extracting attack behaviors as graphs from unstructured text.
Our evaluation results show that provenanceOR can extract concise graphs from CTI reports and can successfully be used by cyber-analytics tools in threat-hunting.
arXiv Detail & Related papers (2021-04-17T18:51:00Z) - Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z) - Reinforcement Learning-based Black-Box Evasion Attacks to Link
Prediction in Dynamic Graphs [87.5882042724041]
Link prediction in dynamic graphs (LPDG) is an important research problem that has diverse applications.
We study the vulnerability of LPDG methods and propose the first practical black-box evasion attack.
arXiv Detail & Related papers (2020-09-01T01:04:49Z) - Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs)
GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features.
It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.