From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions
- URL: http://arxiv.org/abs/2512.19414v1
- Date: Mon, 22 Dec 2025 14:13:01 GMT
- Title: From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions
- Authors: Jiaren Peng, Hongda Sun, Xuan Tian, Cheng Huang, Zeqing Li, Rui Yan
- Abstract summary: TTPrompt is a framework shifting from implicit induction to explicit instruction. FIR enables LLMs to self-refine guidelines by learning from errors on minimal labeled data. With refinement on just 1% of training data, TTPrompt rivals models fine-tuned on the full dataset.
- Score: 15.710492251334792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The automation of Cyber Threat Intelligence (CTI) relies heavily on Named Entity Recognition (NER) to extract critical entities from unstructured text. Currently, Large Language Models (LLMs) primarily address this task through retrieval-based In-Context Learning (ICL). This paper analyzes this mainstream paradigm, revealing a fundamental flaw: its success stems not from global semantic similarity but largely from the incidental overlap of entity types within retrieved examples. This exposes the limitations of relying on unreliable implicit induction. To address this, we propose TTPrompt, a framework shifting from implicit induction to explicit instruction. TTPrompt maps the core concepts of CTI's Tactics, Techniques, and Procedures (TTPs) into an instruction hierarchy: formulating task definitions as Tactics, guiding strategies as Techniques, and annotation guidelines as Procedures. Furthermore, to handle the adaptability challenge of static guidelines, we introduce Feedback-driven Instruction Refinement (FIR). FIR enables LLMs to self-refine guidelines by learning from errors on minimal labeled data, adapting to distinct annotation dialects. Experiments on five CTI NER benchmarks demonstrate that TTPrompt consistently surpasses retrieval-based baselines. Notably, with refinement on just 1% of training data, it rivals models fine-tuned on the full dataset. For instance, on LADDER, its Micro F1 of 71.96% approaches the fine-tuned baseline, and on the complex CTINexus, its Macro F1 exceeds the fine-tuned ACLM model by 10.91%.
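The abstract reports both Micro F1 and Macro F1, which can diverge sharply when entity types are imbalanced. The following is a minimal sketch of the two averaging schemes for entity-level NER, not the paper's evaluation code. It assumes exact-match scoring on hypothetical `(sentence_id, start, end, type)` tuples and averages Macro F1 over every type that appears in the gold or predicted set; the benchmarks' actual protocols may differ.

```python
from collections import Counter


def f1_score(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when precision or recall is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_macro_f1(gold, pred):
    """Compute (micro_f1, macro_f1) for two sets of entity tuples.

    Each entity is (sentence_id, start, end, type). A prediction counts as
    correct only if the whole tuple matches a gold entity (assumed criterion).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for ent in pred:
        if ent in gold:
            tp[ent[3]] += 1
        else:
            fp[ent[3]] += 1
    for ent in gold:
        if ent not in pred:
            fn[ent[3]] += 1

    # Micro: pool counts over all types, then compute one F1.
    micro = f1_score(sum(tp.values()), sum(fp.values()), sum(fn.values()))

    # Macro: compute F1 per type, then take the unweighted mean.
    types = set(tp) | set(fp) | set(fn)
    macro = sum(f1_score(tp[t], fp[t], fn[t]) for t in types) / len(types) if types else 0.0
    return micro, macro


# Toy data: four Malware mentions found correctly, one ThreatActor missed.
gold = {
    (0, 0, 1, "Malware"),
    (0, 5, 6, "Malware"),
    (1, 2, 3, "Malware"),
    (1, 7, 8, "Malware"),
    (2, 0, 2, "ThreatActor"),
}
pred = {e for e in gold if e[3] == "Malware"}

micro, macro = micro_macro_f1(gold, pred)
print(f"micro={micro:.4f} macro={macro:.4f}")
```

In this toy case the frequent type dominates the pooled counts, so Micro F1 stays high while Macro F1 is pulled down by the rare type that was missed entirely. This is why results on datasets with many rare entity types are often reported with Macro F1.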
Related papers
- The Procedural Semantics Gap in Structured CTI: A Measurement-Driven STIX Analysis for APT Emulation [0.5399800035598185]
Cyber threat intelligence (CTI) encoded in STIX and structured according to the MITRE ATT&CK framework has become a global reference for describing adversary behavior.
We ask whether its structured artifacts contain sufficient behavioral detail to support multi-stage adversary emulation.
arXiv Detail & Related papers (2025-12-12T22:53:52Z)
- KBQA-R1: Reinforcing Large Language Models for Knowledge Base Question Answering [64.62317305868264]
We present KBQA-R1, a framework that shifts the paradigm from text imitation to interaction optimization via Reinforcement Learning.
Treating KBQA as a multi-turn decision process, our model learns to navigate the knowledge base using a list of actions.
Experiments on WebQSP, GrailQA, and GraphQuestions demonstrate that KBQA-R1 achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-12-10T17:45:42Z)
- PANER: A Paraphrase-Augmented Framework for Low-Resource Named Entity Recognition [9.164874578520722]
We present a lightweight few-shot NER framework that combines principles from prior IT approaches to leverage the large context window of recent state-of-the-art LLMs.
Experiments on benchmark datasets show that our method achieves performance comparable to state-of-the-art models on few-shot and zero-shot tasks.
arXiv Detail & Related papers (2025-10-20T16:36:18Z)
- CoT Referring: Improving Referring Expression Tasks with Grounded Reasoning [67.18702329644526]
CoT Referring enhances model reasoning across modalities through a structured, chain-of-thought training data structure.
We restructure the training data to enforce a new output form, providing new annotations for existing datasets.
We also integrate detection and segmentation capabilities into a unified MLLM framework, training it with a novel adaptive weighted loss to optimize performance.
arXiv Detail & Related papers (2025-10-03T08:50:21Z)
- LRCTI: A Large Language Model-Based Framework for Multi-Step Evidence Retrieval and Reasoning in Cyber Threat Intelligence Credibility Verification [7.608817324043705]
We propose LRCTI, a framework designed for multi-step Cyber Threat Intelligence credibility verification.
The framework first employs a text summarization module to distill complex intelligence reports into concise and actionable threat claims.
It then uses an adaptive multi-step evidence retrieval mechanism that iteratively identifies and refines supporting information from a CTI-specific corpus.
Experiments conducted on two benchmark datasets, CTI-200 and PolitiFact, show that LRCTI improves F1-Macro and F1-Micro scores by over 5%, reaching 90.9% and 93.6%, respectively.
arXiv Detail & Related papers (2025-07-15T13:42:32Z)
- DecIF: Improving Instruction-Following through Meta-Decomposition [9.939860059820917]
DecIF is a fully autonomous, meta-decomposition guided framework that generates diverse and high-quality instruction-following data.
For instruction generation, we guide LLMs to iteratively produce various types of meta-information, which are then combined with response constraints to form semantically rich instructions.
For response generation, we decompose each instruction into atomic-level evaluation criteria, enabling rigorous validation and the elimination of inaccurate instruction-response pairs.
arXiv Detail & Related papers (2025-05-20T06:38:28Z)
- Instantiating Standards: Enabling Standard-Driven Text TTP Extraction with Evolvable Memory [4.909107168534244]
We introduce a novel framework that converts abstract standard definitions into actionable, contextualized knowledge.
Our method utilizes a Large Language Model (LLM) to generate, update, and apply this knowledge.
Experiments show our framework boosts Technique F1 scores by 11% over GPT-4o.
arXiv Detail & Related papers (2025-05-14T10:22:13Z)
- Learning Task Representations from In-Context Learning [67.66042137487287]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL).
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
The proposed method successfully extracts task-specific information from in-context demonstrations and excels in both text and regression tasks.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
- Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction [75.25114727856861]
Large language models (LLMs) tend to suffer from deterioration at the latter stage of the supervised fine-tuning process.
We introduce a simple disperse-then-merge framework to address the issue.
Our framework outperforms various sophisticated methods such as data curation and training regularization on a series of standard knowledge and reasoning benchmarks.
arXiv Detail & Related papers (2024-05-22T08:18:19Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [58.617025733655005]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the vocabulary forming the prompt texts beyond the closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.