Related papers: AttacKG+:Boosting Attack Knowledge Graph Construction with Large Language Models

AttacKG+:Boosting Attack Knowledge Graph Construction with Large Language Models

URL: http://arxiv.org/abs/2405.04753v1
Date: Wed, 8 May 2024 01:41:25 GMT
Title: AttacKG+:Boosting Attack Knowledge Graph Construction with Large Language Models
Authors: Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, Ee-Chien Chang,
Abstract summary: Large Language Models (LLMs) have achieved enormous success in a broad range of tasks. Our framework consists of four consecutive modules: rewriter, identifier, and summarizer. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation.
Score: 17.89951919370619
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, they generally suffer from limited generalization capability to diverse knowledge types as well as requirement of expertise in model design and tuning. Addressing these limitations, we seek to utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks given exceptional capabilities in both language understanding and zero-shot task fulfillment. Thus, we propose a fully automatic LLM-based framework to construct attack knowledge graphs named: AttacKG+. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation, including behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.

Related papers

MM-AttacKG: A Multimodal Approach to Attack Graph Construction with Large Language Models [12.2085847920673]
We propose a novel framework, MM-AttacKG, which can effectively extract key information from threat images and integrate it into attack graph construction.<n> MM-AttacKG can accurately identify key information in threat images and significantly improve the quality of multimodal attack graph construction.
arXiv Detail & Related papers (2025-06-20T12:59:31Z)
TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks [3.3238054848751535]
We introduce TrustGLM, a comprehensive study evaluating the vulnerability of GraphLLMs to adversarial attacks across three dimensions: text, graph structure, and prompt manipulations.<n>Our findings reveal that GraphLLMs are highly susceptible to text attacks that merely replace a few semantically similar words in a node's textual attribute.<n>We also find that standard graph structure attack methods can significantly degrade model performance, while random shuffling of the candidate label set in prompt templates leads to substantial performance drops.
arXiv Detail & Related papers (2025-06-13T14:48:01Z)
AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks [13.082370325093242]
We introduce AttackSeqBench, a benchmark to evaluate Large Language Models' (LLMs) capability to understand and reason attack sequences in Cyber Threat Intelligence (CTI) reports. Our benchmark encompasses three distinct Question Answering (QA) tasks, each task focuses on the varying granularity in adversarial behavior. We conduct extensive experiments and analysis with both fast-thinking and slow-thinking LLMs, while highlighting their strengths and limitations in analyzing the sequential patterns in cyber attacks.
arXiv Detail & Related papers (2025-03-05T04:25:21Z)
MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques [7.4166591335540595]
We propose MultiKG, a fully automated framework that integrates multiple threat knowledge sources. We implemented MultiKG and evaluated it using 1,015 real attack techniques and 9,006 attack intelligence entries from CTI reports. Results show that MultiKG effectively extracts attack knowledge graphs from diverse sources and aggregates them into accurate, comprehensive representations.
arXiv Detail & Related papers (2024-11-13T06:15:48Z)
CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats. Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction. We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z)
Cyber Knowledge Completion Using Large Language Models [1.4883782513177093]
Integrating the Internet of Things (IoT) into Cyber-Physical Systems (CPSs) has expanded their cyber-attack surface. Assessing the risks of CPSs is increasingly difficult due to incomplete and outdated cybersecurity knowledge. Recent advancements in Large Language Models (LLMs) present a unique opportunity to enhance cyber-attack knowledge completion.
arXiv Detail & Related papers (2024-09-24T15:20:39Z)
KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment [38.312774244521]
We propose a knowledge graph-based verifier for Cyber Threat Intelligence (CTI) quality assessment framework. Our approach introduces Large Language Models (LLMs) to automatically extract OSCTI key claims to be verified. To fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources.
arXiv Detail & Related papers (2024-08-15T11:32:46Z)
Using Retriever Augmented Large Language Models for Attack Graph Generation [0.7619404259039284]
This paper explores the approach of leveraging large language models (LLMs) to automate the generation of attack graphs. It shows how to utilize Common Vulnerabilities and Exposures (CommonLLMs) to create attack graphs from threat reports.
arXiv Detail & Related papers (2024-08-11T19:59:08Z)
AnnoCTR: A Dataset for Detecting and Linking Entities, Tactics, and Techniques in Cyber Threat Reports [3.6785107661544805]
We present AnnoCTR, a new CC-BY-SA-licensed dataset of cyber threat reports. The reports have been annotated by a domain expert with named entities, temporal expressions, and cybersecurity-specific concepts. In our few-shot scenario, we find that for identifying the MITRE ATT&CK concepts that are mentioned explicitly or implicitly in a text, concept descriptions from MITRE ATT&CK are an effective source for training data augmentation.
arXiv Detail & Related papers (2024-04-11T14:04:36Z)
Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks. Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments. Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora. Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
Graph Information Bottleneck [77.21967740646784]
Graph Neural Networks (GNNs) provide an expressive way to fuse information from network structure and node features. Inheriting from the general Information Bottleneck (IB), GIB aims to learn the minimal sufficient representation for a given task. We show that our proposed models are more robust than state-of-the-art graph defense models.
arXiv Detail & Related papers (2020-10-24T07:13:00Z)
Structure-Augmented Text Representation Learning for Efficient Knowledge Graph Completion [53.31911669146451]
Human-curated knowledge graphs provide critical supportive information to various natural language processing tasks. These graphs are usually incomplete, urging auto-completion of them. graph embedding approaches, e.g., TransE, learn structured knowledge via representing graph elements into dense embeddings. textual encoding approaches, e.g., KG-BERT, resort to graph triple's text and triple-level contextualized representations.
arXiv Detail & Related papers (2020-04-30T13:50:34Z)
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs. Building upon entity-level masked language models, our first contribution is an entity masking scheme. In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.