Uncovering Vulnerabilities of LLM-Assisted Cyber Threat Intelligence
- URL: http://arxiv.org/abs/2509.23573v2
- Date: Wed, 01 Oct 2025 15:57:32 GMT
- Title: Uncovering Vulnerabilities of LLM-Assisted Cyber Threat Intelligence
- Authors: Yuqiao Meng, Luoxi Tang, Feiyang Yu, Jinyuan Jia, Guanhua Yan, Ping Yang, Zhaohan Xi
- Abstract summary: Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats. In this paper, we investigate the intrinsic vulnerabilities of LLMs in cyber threat intelligence (CTI). We introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision.
- Score: 15.881854286231997
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats, wherein LLMs offer cyber threat intelligence (CTI) to support vulnerability assessment and incident response. While recent work has shown that LLMs can support a wide range of CTI tasks such as threat analysis, vulnerability detection, and intrusion defense, significant performance gaps persist in practical deployments. In this paper, we investigate the intrinsic vulnerabilities of LLMs in CTI, focusing on challenges that arise from the nature of the threat landscape itself rather than the model architecture. Using large-scale evaluations across multiple CTI benchmarks and real-world threat reports, we introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision to reliably analyze failure instances. Through extensive experiments and human inspections, we reveal three fundamental vulnerabilities: spurious correlations, contradictory knowledge, and constrained generalization. These vulnerabilities limit the effectiveness of LLMs in supporting CTI. Subsequently, we provide actionable insights for designing more robust LLM-powered CTI systems to facilitate future research.
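The abstract describes the failure-analysis methodology only at a high level (stratification, autoregressive refinement, human-in-the-loop supervision). A minimal sketch of what such a triage loop might look like is below; all function names, record fields, and the 0.5 confidence threshold are hypothetical illustrations, not the paper's actual method:

```python
from collections import Counter

# Hypothetical sketch: stratify failure instances, iteratively refine
# labels within each stratum, and escalate low-confidence cases to a
# human reviewer. Names and thresholds are illustrative only.

def stratify(failures, key):
    """Group failure records into strata by a chosen attribute."""
    strata = {}
    for f in failures:
        strata.setdefault(f[key], []).append(f)
    return strata

def refine_labels(stratum, passes=3):
    """Iterative refinement: on each pass, low-confidence records adopt
    the current majority label of their stratum."""
    labels = [f["initial_label"] for f in stratum]
    for _ in range(passes):
        majority, _ = Counter(labels).most_common(1)[0]
        labels = [lab if f["confidence"] >= 0.5 else majority
                  for f, lab in zip(stratum, labels)]
    return labels

def triage(failures, key="task"):
    """Return (auto_labeled, needs_human_review) lists of (id, label)."""
    auto, review = [], []
    for stratum in stratify(failures, key).values():
        labels = refine_labels(stratum)
        for f, label in zip(stratum, labels):
            (auto if f["confidence"] >= 0.5 else review).append((f["id"], label))
    return auto, review
```

The split between `auto` and `review` is where human-in-the-loop supervision would plug in: only the uncertain residue of each stratum reaches a reviewer.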
Related papers
- CTIArena: Benchmarking LLM Knowledge and Reasoning Across Heterogeneous Cyber Threat Intelligence [48.63397742510097]
Cyber threat intelligence (CTI) is central to modern cybersecurity, providing critical insights for detecting and mitigating evolving threats. With the natural language understanding and reasoning capabilities of large language models (LLMs), there is increasing interest in applying them to CTI. We present CTIArena, the first benchmark for evaluating LLM performance on heterogeneous, multi-source CTI.
arXiv Detail & Related papers (2025-10-13T22:10:17Z)
- POLAR: Automating Cyber Threat Prioritization through LLM-Powered Assessment [13.18964488705143]
Large Language Models (LLMs) are intensively used to assist security analysts in counteracting the rapid exploitation of cyber threats. In this paper, we investigate the intrinsic vulnerabilities of LLMs in cyber threat intelligence (CTI). We introduce a novel categorization methodology that integrates stratification, autoregressive refinement, and human-in-the-loop supervision.
arXiv Detail & Related papers (2025-10-02T00:49:20Z)
- NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models [68.09675063543402]
NeuroBreak is a top-down jailbreak analysis system designed to analyze neuron-level safety mechanisms and mitigate vulnerabilities. By incorporating layer-wise representation probing analysis, NeuroBreak offers a novel perspective on the model's decision-making process. We conduct quantitative evaluations and case studies to verify the effectiveness of our system.
arXiv Detail & Related papers (2025-09-04T08:12:06Z)
- Advancing Autonomous Incident Response: Leveraging LLMs and Cyber Threat Intelligence [3.2284427438223013]
Security teams are overwhelmed by alert fatigue, high false-positive rates, and the vast volume of unstructured Cyber Threat Intelligence (CTI) documents. We introduce a novel Retrieval-Augmented Generation (RAG)-based framework that leverages Large Language Models (LLMs) to automate and enhance incident response (IR). Our approach introduces a hybrid retrieval mechanism that combines NLP-based similarity searches within a CTI vector database with standardized queries to external CTI platforms.
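The hybrid retrieval mechanism is only named in this summary. As a rough illustration (the function names, the boost rule, and the `alpha` weight are assumptions, not the paper's design), such a step might merge local embedding similarity with corroboration from an external CTI platform:

```python
import math

# Illustrative hybrid retrieval: score local documents by cosine
# similarity in a CTI vector store, boost any that an external CTI
# platform also returned, and keep the top-k. All names and the
# merge rule are hypothetical.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query_vec, vector_store, external_hits, k=3, alpha=0.7):
    """Blend vector-store similarity with external-platform hits."""
    scored = []
    for doc_id, vec in vector_store.items():
        score = alpha * cosine(query_vec, vec)
        if doc_id in external_hits:
            score += 1 - alpha  # corroborated by an external CTI source
        scored.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]
```

A real system would replace the toy vectors with embeddings and the `external_hits` set with structured query results (e.g., from a TAXII feed), but the weighted-merge shape stays the same.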
arXiv Detail & Related papers (2025-08-14T14:20:34Z)
- AttackSeqBench: Benchmarking Large Language Models' Understanding of Sequential Patterns in Cyber Attacks [13.082370325093242]
We introduce AttackSeqBench, a benchmark to evaluate Large Language Models' (LLMs) capability to understand and reason about attack sequences in Cyber Threat Intelligence (CTI) reports. Our benchmark encompasses three distinct Question Answering (QA) tasks, each focusing on a different granularity of adversarial behavior. We conduct extensive experiments and analysis with both fast-thinking and slow-thinking LLMs, highlighting their strengths and limitations in analyzing the sequential patterns in cyber attacks.
arXiv Detail & Related papers (2025-03-05T04:25:21Z)
- Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models [19.781204384395064]
Small language models (SLMs) have become increasingly prominent in deployment on edge devices due to their high efficiency and low computational cost. We provide a comprehensive empirical study evaluating the security performance of 13 state-of-the-art SLMs under various jailbreak attacks. Our experiments demonstrate that most SLMs are quite susceptible to existing jailbreak attacks, while some are even vulnerable to direct harmful prompts.
arXiv Detail & Related papers (2025-02-27T08:44:04Z)
- Adversarial Reasoning at Jailbreaking Time [49.70772424278124]
Large language models (LLMs) are becoming more capable and widespread. Recent advances in standardizing, measuring, and scaling test-time compute suggest new methodologies for optimizing models to achieve high performance on hard tasks. In this paper, we apply these advances to the task of model jailbreaking: eliciting harmful responses from aligned LLMs.
arXiv Detail & Related papers (2025-02-03T18:59:01Z)
- Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z)
- Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents [67.07177243654485]
This survey collects and analyzes the different threats faced by large language model-based agents.
We identify six key features of LLM-based agents, based on which we summarize the current research progress.
We select four representative agents as case studies to analyze the risks they may face in practical use.
arXiv Detail & Related papers (2024-11-14T15:40:04Z)
- CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence [0.7499722271664147]
CTIBench is a benchmark designed to assess Large Language Models' performance in CTI applications.
Our evaluation of several state-of-the-art models on these tasks provides insights into their strengths and weaknesses in CTI contexts.
arXiv Detail & Related papers (2024-06-11T16:42:02Z)
- On the Vulnerability of LLM/VLM-Controlled Robotics [54.57914943017522]
We highlight vulnerabilities in robotic systems integrating large language models (LLMs) and vision-language models (VLMs) due to input modality sensitivities. Our results show that simple input perturbations reduce task execution success rates by 22.2% and 14.6% in two representative LLM/VLM-controlled robotic systems.
arXiv Detail & Related papers (2024-02-15T22:01:45Z)
- Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks. This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks. We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
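The summary does not detail how ICLPoison works; as a generic toy illustration of the broader threat class (not the paper's algorithm), poisoning in-context learning can be as simple as flipping labels in the few-shot demonstrations before they are assembled into a prompt. All names below are hypothetical:

```python
import random

# Toy illustration of demonstration poisoning for in-context learning:
# flip the labels of a fraction of few-shot examples, then build the
# prompt. This is a simplified sketch, not the ICLPoison framework.

def poison_demos(demos, label_set, rate=0.3, seed=0):
    """Return a copy of (text, label) demos with ~rate of labels flipped."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in demos:
        if rng.random() < rate:
            wrong = [l for l in label_set if l != label]
            label = rng.choice(wrong)
        poisoned.append((text, label))
    return poisoned

def build_prompt(demos, query):
    """Assemble a few-shot prompt from (text, label) pairs."""
    lines = [f"Input: {t}\nLabel: {l}" for t, l in demos]
    lines.append(f"Input: {query}\nLabel:")
    return "\n\n".join(lines)
```

Even this crude label-flipping degrades ICL accuracy; ICLPoison itself targets the learning mechanism more directly, per the summary above.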
arXiv Detail & Related papers (2024-02-03T14:20:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.