Rule-ATT&CK Mapper (RAM): Mapping SIEM Rules to TTPs Using LLMs
- URL: http://arxiv.org/abs/2502.02337v1
- Date: Tue, 04 Feb 2025 14:16:02 GMT
- Title: Rule-ATT&CK Mapper (RAM): Mapping SIEM Rules to TTPs Using LLMs
- Authors: Prasanna N. Wudali, Moshe Kravchik, Ehud Malul, Parth A. Gandhi, Yuval Elovici, Asaf Shabtai
- Abstract summary: Rule-ATT&CK Mapper (RAM) is a framework that automates the mapping of structured SIEM rules to MITRE ATT&CK techniques. RAM's multi-stage pipeline, inspired by the prompt chaining technique, enhances mapping accuracy without requiring LLM pre-training or fine-tuning.
- Score: 22.791057694472634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing frequency of cyberattacks has heightened the demand for accurate and efficient threat detection systems. Security information and event management (SIEM) platforms are important for analyzing log data and detecting adversarial activities through rule-based queries, also known as SIEM rules. The efficiency of the threat analysis process relies heavily on mapping these SIEM rules to the relevant attack techniques in the MITRE ATT&CK framework. Inaccurate annotation of SIEM rules can result in the misinterpretation of attacks, increasing the likelihood that threats will be overlooked. Existing solutions for annotating SIEM rules with MITRE ATT&CK technique labels have notable limitations: manual annotation of SIEM rules is both time-consuming and prone to errors, and ML-based approaches mainly focus on annotating unstructured free-text sources rather than structured data like SIEM rules. Structured data often contains limited information, further complicating the annotation process and making it a challenging task. To address these challenges, we propose Rule-ATT&CK Mapper (RAM), a novel framework that leverages large language models (LLMs) to automate the mapping of structured SIEM rules to MITRE ATT&CK techniques. RAM's multi-stage pipeline, inspired by the prompt chaining technique, enhances mapping accuracy without requiring LLM pre-training or fine-tuning. Using the Splunk Security Content dataset, we evaluate RAM's performance with several LLMs, including GPT-4-Turbo, Qwen, IBM Granite, and Mistral. Our evaluation highlights GPT-4-Turbo's superior performance, which derives from its enriched knowledge base, and an ablation study emphasizes the importance of external contextual knowledge in overcoming the limitations of LLMs' implicit knowledge for domain-specific tasks. These findings demonstrate RAM's potential in automating cybersecurity workflows and provide valuable insights for future advancements in this field.
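To make the pipeline concrete, here is a minimal sketch of the prompt-chaining pattern the abstract describes. Everything in it is an illustrative assumption rather than RAM's actual implementation: the three stage prompts, the generic `llm` callable, and the example Splunk-style rule are invented for this sketch.

```python
# Minimal sketch of a prompt-chaining pipeline for mapping a SIEM rule to an
# ATT&CK technique. The stage prompts, the generic `llm` callable, and the
# example rule are illustrative assumptions -- not RAM's actual prompts or API.
from typing import Callable

def map_rule_to_attack(rule: str, llm: Callable[[str], str]) -> str:
    """Chain three prompts: summarize the rule, propose candidate
    techniques, then commit to a single technique ID."""
    # Stage 1: turn the terse, structured rule into a natural-language summary.
    summary = llm(
        "Explain in plain English what adversary behavior this SIEM rule "
        f"detects:\n{rule}"
    )
    # Stage 2: propose candidate MITRE ATT&CK techniques for that behavior.
    candidates = llm(
        "List up to five MITRE ATT&CK technique IDs (e.g., T1059.001) that "
        f"could match this behavior:\n{summary}"
    )
    # Stage 3: force a single, final answer grounded in the candidates.
    final = llm(
        "From these candidates, return only the single best-matching "
        f"technique ID.\nBehavior: {summary}\nCandidates: {candidates}"
    )
    return final.strip()

# Example Splunk-style rule (illustrative only):
example_rule = (
    'index=wineventlog EventCode=4688 '
    '| search NewProcessName="*\\\\powershell.exe" CommandLine="*-enc*"'
)
# `llm` would wrap any chat-completion endpoint, e.g.:
# technique = map_rule_to_attack(example_rule, llm=my_chat_completion)
```

The value of chaining is that each stage narrows the output space: a free-form explanation first, a bounded candidate list second, and a single constrained technique ID last, which is easier to validate than the response to one monolithic prompt.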
Related papers
- MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM).
MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task.
The LLM enhances the domain-knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z)
- How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities [62.474732677086855]
Large language model (LLM) routing has emerged as a crucial strategy for balancing computational costs with performance.
We propose the DSC benchmark: Diverse, Simple, and Categorized, an evaluation framework that categorizes router performance across a broad spectrum of query types.
arXiv Detail & Related papers (2025-03-20T19:52:30Z)
- Labeling NIDS Rules with MITRE ATT&CK Techniques: Machine Learning vs. Large Language Models [4.440432588828829]
Large Language Models (LLMs) may be a promising technology to reduce the alert explainability gap by associating rules with attack techniques.
In this paper, we investigate the ability of three prominent LLMs to reason about NIDS rules while labeling them with MITRE ATT&CK tactics and techniques.
Our results indicate that while LLMs provide explainable, scalable, and efficient initial mappings, traditional Machine Learning (ML) models consistently outperform them in accuracy, achieving higher precision, recall, and F1-scores.
arXiv Detail & Related papers (2024-12-14T21:52:35Z)
- System-Level Defense against Indirect Prompt Injection Attacks: An Information Flow Control Perspective [24.583984374370342]
Large Language Model-based systems (LLM systems) process both information and user queries, which exposes them to indirect prompt injection attacks.
We present a system-level defense based on the principles of information flow control that we call an f-secure LLM system.
arXiv Detail & Related papers (2024-09-27T18:41:58Z)
- Evaluating LLM-based Personal Information Extraction and Countermeasures [63.91918057570824]
We benchmark large language model (LLM) based personal information extraction and its countermeasures.
LLMs can be misused by attackers to accurately extract various kinds of personal information from personal profiles.
Prompt injection can defend against strong LLM-based attacks, reducing them to less effective traditional ones.
arXiv Detail & Related papers (2024-08-14T04:49:30Z)
- Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress.
Our investigation exposes a critical oversight in the belief that base LLMs, lacking instruction tuning, cannot follow malicious instructions.
By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
- TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve our task of discrete reasoning over tabular and textual data.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z)
- Advancing TTP Analysis: Harnessing the Power of Large Language Models with Retrieval Augmented Generation [1.2289361708127877]
It is unclear how Large Language Models (LLMs) can be used in an efficient and proper way to provide accurate responses for critical domains such as cybersecurity.
This work studies and compares the uses of supervised fine-tuning (SFT) of encoder-only LLMs vs. Retrieval Augmented Generation (RAG) for decoder-only LLMs.
Our studies show that decoder-only LLMs with RAG achieve better performance than encoder-only models with SFT (a minimal retrieval-augmented sketch appears after this list).
arXiv Detail & Related papers (2023-12-30T16:56:24Z)
- Secure Instruction and Data-Level Information Flow Tracking Model for RISC-V [0.0]
Unauthorized access, fault injection, and privacy invasion are potential threats from untrusted actors.
We propose an integrated Information Flow Tracking (IFT) technique that enables runtime security to protect system integrity.
This study proposes a multi-level IFT model that integrates a hardware-based IFT technique with a gate-level IFT (GLIFT) technique.
arXiv Detail & Related papers (2023-11-17T02:04:07Z)
- Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [59.596335292426105]
This paper collects the first open-source dataset to evaluate safeguards in large language models.
We train several BERT-like classifiers to achieve results comparable with GPT-4 on automatic safety evaluation.
arXiv Detail & Related papers (2023-08-25T14:02:12Z)
- On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions [1.6317061277457001]
Tactics, Techniques, and Procedures (TTPs) describe how and why attackers exploit vulnerabilities.
A TTP description written by one security professional can be interpreted very differently by another, leading to confusion in cybersecurity operations.
Advancements in AI have led to the increasing use of Natural Language Processing (NLP) algorithms to assist with various tasks in cyber operations.
arXiv Detail & Related papers (2023-06-24T21:08:15Z)
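The RAG entry above ("Advancing TTP Analysis") contrasts retrieval augmentation with supervised fine-tuning; the sketch below shows the retrieve-then-prompt pattern in miniature. The toy ATT&CK corpus, the bag-of-words similarity, and the prompt template are illustrative assumptions, not the paper's implementation; a real pipeline would use a dense encoder and the full ATT&CK knowledge base.

```python
# Minimal sketch of retrieval-augmented generation (RAG) for TTP analysis.
# The tiny corpus, bag-of-words "embedding", and prompt template are
# illustrative assumptions only.
import math
from collections import Counter

ATTACK_CORPUS = {  # toy stand-in for ATT&CK technique descriptions
    "T1059": "Adversaries abuse command and script interpreters to execute commands.",
    "T1003": "Adversaries dump credentials from the operating system to obtain account logins.",
    "T1566": "Adversaries send phishing messages to gain access to victim systems.",
}

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real pipeline would use a dense encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k technique descriptions most similar to the query."""
    q = embed(query)
    ranked = sorted(ATTACK_CORPUS.items(),
                    key=lambda kv: cosine(q, embed(kv[1])), reverse=True)
    return ranked[:k]

def build_prompt(procedure: str) -> str:
    """Augment the question with retrieved context for a decoder-only LLM."""
    context = "\n".join(f"{tid}: {desc}" for tid, desc in retrieve(procedure))
    return (f"Context:\n{context}\n\n"
            f"Procedure: {procedure}\n"
            "Which technique ID best matches the procedure?")

print(build_prompt("The malware used PowerShell commands to run scripts."))
```

Keeping retrieval separate from generation is what lets a frozen decoder-only model answer domain-specific questions: the domain knowledge lives in the retrieved corpus rather than in the model weights.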