LLMCloudHunter: Harnessing LLMs for Automated Extraction of Detection Rules from Cloud-Based CTI
- URL: http://arxiv.org/abs/2407.05194v1
- Date: Sat, 6 Jul 2024 21:43:35 GMT
- Title: LLMCloudHunter: Harnessing LLMs for Automated Extraction of Detection Rules from Cloud-Based CTI
- Authors: Yuval Schwartz, Lavi Benshimol, Dudu Mimran, Yuval Elovici, Asaf Shabtai,
- Abstract summary: Open-source cyber threat intelligence (OS-CTI) is a valuable resource for threat hunters.
Previous studies aimed at automating OSCTI analysis failed to provide actionable outputs.
We propose LLMCloudHunter, a novel framework that automatically generates generic-signature detection rule candidates from OSCTI data.
- Score: 24.312198733476063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the number and sophistication of cyber attacks have increased, threat hunting has become a critical aspect of active security, enabling proactive detection and mitigation of threats before they cause significant harm. Open-source cyber threat intelligence (OS-CTI) is a valuable resource for threat hunters, however, it often comes in unstructured formats that require further manual analysis. Previous studies aimed at automating OSCTI analysis are limited since (1) they failed to provide actionable outputs, (2) they did not take advantage of images present in OSCTI sources, and (3) they focused on on-premises environments, overlooking the growing importance of cloud environments. To address these gaps, we propose LLMCloudHunter, a novel framework that leverages large language models (LLMs) to automatically generate generic-signature detection rule candidates from textual and visual OSCTI data. We evaluated the quality of the rules generated by the proposed framework using 12 annotated real-world cloud threat reports. The results show that our framework achieved a precision of 92% and recall of 98% for the task of accurately extracting API calls made by the threat actor and a precision of 99% with a recall of 98% for IoCs. Additionally, 99.18% of the generated detection rule candidates were successfully compiled and converted into Splunk queries.
Related papers
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities [0.29998889086656577]
The relentless process of tracking and remediating vulnerabilities is a top concern for cybersecurity professionals.
We provide a framework for vulnerability management specifically focused on mitigating threats using adversary criteria derived from MITRE ATT&CK.
Our results show an average 71.5% - 91.3% improvement towards the identification of vulnerabilities likely to be targeted and exploited by cyber threat actors.
arXiv Detail & Related papers (2024-06-09T23:29:12Z) - DPP-Based Adversarial Prompt Searching for Lanugage Models [56.73828162194457]
Auto-regressive Selective Replacement Ascent (ASRA) is a discrete optimization algorithm that selects prompts based on both quality and similarity with determinantal point process (DPP)
Experimental results on six different pre-trained language models demonstrate the efficacy of ASRA for eliciting toxic content.
arXiv Detail & Related papers (2024-03-01T05:28:06Z) - Forcing Generative Models to Degenerate Ones: The Power of Data
Poisoning Attacks [10.732558183444985]
Malicious actors can covertly exploit large language models (LLMs) vulnerabilities through poisoning attacks aimed at generating undesirable outputs.
This paper explores various poisoning techniques to assess their effectiveness across a range of generative tasks.
We show that it is possible to successfully poison an LLM during the fine-tuning stage using as little as 1% of the total tuning data samples.
arXiv Detail & Related papers (2023-12-07T23:26:06Z) - Risk-optimized Outlier Removal for Robust 3D Point Cloud Classification [54.286437930350445]
This paper highlights the challenges of point cloud classification posed by various forms of noise.
We introduce an innovative point outlier cleansing method that harnesses the power of downstream classification models.
Our proposed technique not only robustly filters diverse point cloud outliers but also consistently and significantly enhances existing robust methods for point cloud classification.
arXiv Detail & Related papers (2023-07-20T13:47:30Z) - On the Security Risks of Knowledge Graph Reasoning [71.64027889145261]
We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors.
We present ROAR, a new class of attacks that instantiate a variety of such threats.
We explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries.
arXiv Detail & Related papers (2023-05-03T18:47:42Z) - Unsupervised User-Based Insider Threat Detection Using Bayesian Gaussian
Mixture Models [0.0]
In this paper, we propose an unsupervised insider threat detection system based on audit data.
The proposed approach leverages a user-based model to optimize specific behaviors modelization and an automatic feature extraction system based on Word2Vec.
Results indicate that the proposed method competes with state-of-the-art approaches, presenting a good recall of 88%, accuracy and true negative rate of 93%, and a false positive rate of 6.9%.
arXiv Detail & Related papers (2022-11-23T13:45:19Z) - A System for Efficiently Hunting for Cyber Threats in Computer Systems
Using Threat Intelligence [78.23170229258162]
We build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI.
ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, and (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors.
arXiv Detail & Related papers (2021-01-17T19:44:09Z) - Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence [94.94833077653998]
ThreatRaptor is a system that facilitates threat hunting in computer systems using open-source Cyber Threat Intelligence (OSCTI)
It extracts structured threat behaviors from unstructured OSCTI text and uses a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities.
Evaluations on a broad set of attack cases demonstrate the accuracy and efficiency of ThreatRaptor in practical threat hunting.
arXiv Detail & Related papers (2020-10-26T14:54:01Z) - Phishing URL Detection Through Top-level Domain Analysis: A Descriptive
Approach [3.494620587853103]
This study aims to develop a machine-learning model to detect fraudulent URLs which can be used within the Splunk platform.
Inspired from similar approaches in the literature, we trained the SVM and Random Forests algorithms using malicious and benign datasets.
We evaluated the algorithms' performance with precision and recall, reaching up to 85% precision and 87% recall in the case of Random Forests.
arXiv Detail & Related papers (2020-05-13T21:41:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.