Automatic Mapping of Unstructured Cyber Threat Intelligence: An
Experimental Study
- URL: http://arxiv.org/abs/2208.12144v1
- Date: Thu, 25 Aug 2022 15:01:42 GMT
- Title: Automatic Mapping of Unstructured Cyber Threat Intelligence: An
Experimental Study
- Authors: Vittorio Orbinato, Mariarosaria Barbaraci, Roberto Natella, Domenico
Cotroneo
- Abstract summary: We present an experimental study on the automatic classification of unstructured Cyber Threat Intelligence (CTI) into attack techniques using machine learning (ML)
We contribute with two new datasets for CTI analysis, and we evaluate several ML models, including both traditional and deep learning-based ones.
We present several lessons learned about how ML can perform at this task, which classifiers perform best and under which conditions, which are the main causes of classification errors, and the challenges ahead for CTI analysis.
- Score: 1.1470070927586016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Proactive approaches to security, such as adversary emulation, leverage
information about threat actors and their techniques (Cyber Threat
Intelligence, CTI). However, most CTI still comes in unstructured forms (i.e.,
natural language), such as incident reports and leaked documents. To support
proactive security efforts, we present an experimental study on the automatic
classification of unstructured CTI into attack techniques using machine
learning (ML). We contribute with two new datasets for CTI analysis, and we
evaluate several ML models, including both traditional and deep learning-based
ones. We present several lessons learned about how ML can perform at this task,
which classifiers perform best and under which conditions, which are the main
causes of classification errors, and the challenges ahead for CTI analysis.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - CTISum: A New Benchmark Dataset For Cyber Threat Intelligence Summarization [14.287652216484863]
We present CTISum, a new benchmark for CTI summarization task.
Considering the importance of attack process, a novel fine-grained subtask of attack process summarization is proposed.
arXiv Detail & Related papers (2024-08-13T02:25:16Z) - Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z) - Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models [0.8192907805418583]
Microsoft, Trend Micro, and CrowdStrike are using generative AI to facilitate CTI extraction.
This paper addresses the challenge of automating the extraction of actionable CTI using advancements in Large Language Models (LLMs) and Knowledge Graphs (KGs)
Our methodology evaluates techniques such as prompt engineering, the guidance framework, and fine-tuning to optimize information extraction and structuring.
Experimental results demonstrate the effectiveness of our approach in extracting relevant information, with guidance and fine-tuning showing superior performance over prompt engineering.
arXiv Detail & Related papers (2024-06-30T13:02:03Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - Towards Automated Classification of Attackers' TTPs by combining NLP
with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z) - Threat Assessment in Machine Learning based Systems [12.031113181911627]
We conduct an empirical study of threats reported against Machine Learning-based systems.
The study is based on 89 real-world ML attack scenarios from the MITRE's ATLAS database, the AI Incident Database, and the literature.
Results show that convolutional neural networks were one of the most targeted models among the attack scenarios.
arXiv Detail & Related papers (2022-06-30T20:19:50Z) - Inspect, Understand, Overcome: A Survey of Practical Methods for AI
Safety [54.478842696269304]
The use of deep neural networks (DNNs) in safety-critical applications is challenging due to numerous model-inherent shortcomings.
In recent years, a zoo of state-of-the-art techniques aiming to address these safety concerns has emerged.
Our paper addresses both machine learning experts and safety engineers.
arXiv Detail & Related papers (2021-04-29T09:54:54Z) - ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine
Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular re-usable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z) - Automated Retrieval of ATT&CK Tactics and Techniques for Cyber Threat
Reports [5.789368942487406]
We evaluate several classification approaches to automatically retrieve Tactics, Techniques and Procedures from unstructured text.
We present rcATT, a tool built on top of our findings and freely distributed to the security community to support cyber threat report automated analysis.
arXiv Detail & Related papers (2020-04-29T16:45:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.