NLP-Based Techniques for Cyber Threat Intelligence
- URL: http://arxiv.org/abs/2311.08807v1
- Date: Wed, 15 Nov 2023 09:23:33 GMT
- Title: NLP-Based Techniques for Cyber Threat Intelligence
- Authors: Marco Arazzi, Dincy R. Arikkat, Serena Nicolazzo, Antonino Nocera, Rafidha Rehiman K. A., Vinod P., Mauro Conti,
- Abstract summary: Survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence.
It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets.
It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI.
- Score: 13.958337678497163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence~(CTI) is related to all the solutions inherent to data collection, processing, and analysis useful to understand a threat actor's targets and attack behavior. Currently, CTI is assuming an always more crucial role in identifying and mitigating threats and enabling proactive defense strategies. In this context, NLP, an artificial intelligence branch, has emerged as a powerful tool for enhancing threat intelligence capabilities. This survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI. Finally, the challenges and limitations of NLP in threat intelligence are exhaustively examined, including data quality issues and ethical considerations. This survey draws a complete framework and serves as a valuable resource for security professionals and researchers seeking to understand the state-of-the-art NLP-based threat intelligence techniques and their potential impact on cybersecurity.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment [38.312774244521]
We propose a knowledge graph-based verifier for Cyber Threat Intelligence (CTI) quality assessment framework.
Our approach introduces Large Language Models (LLMs) to automatically extract OSCTI key claims to be verified.
To fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources.
arXiv Detail & Related papers (2024-08-15T11:32:46Z) - Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society.
Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities.
With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z) - ThreatKG: An AI-Powered System for Automated Open-Source Cyber Threat Intelligence Gathering and Management [65.0114141380651]
ThreatKG is an automated system for OSCTI gathering and management.
It efficiently collects a large number of OSCTI reports from multiple sources.
It uses specialized AI-based techniques to extract high-quality knowledge about various threat entities.
arXiv Detail & Related papers (2022-12-20T16:13:59Z) - Recognizing and Extracting Cybersecurtity-relevant Entities from Text [1.7499351967216343]
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks.
CTI is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG)
arXiv Detail & Related papers (2022-08-02T18:44:06Z) - Towards Automated Classification of Attackers' TTPs by combining NLP
with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z) - Generating Cyber Threat Intelligence to Discover Potential Security
Threats Using Classification and Topic Modeling [6.0897744845912865]
Cyber Threat Intelligence (CTI) has been represented as one of the proactive and robust mechanisms.
Our goal is to identify and explore relevant CTI from hacker forums by using different supervised and unsupervised learning techniques.
arXiv Detail & Related papers (2021-08-16T02:30:29Z) - A System for Automated Open-Source Threat Intelligence Gathering and
Management [53.65687495231605]
SecurityKG is a system for automated OSCTI gathering and management.
It uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors.
arXiv Detail & Related papers (2021-01-19T18:31:35Z) - A System for Efficiently Hunting for Cyber Threats in Computer Systems
Using Threat Intelligence [78.23170229258162]
We build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI.
ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, and (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors.
arXiv Detail & Related papers (2021-01-17T19:44:09Z) - Automated Retrieval of ATT&CK Tactics and Techniques for Cyber Threat
Reports [5.789368942487406]
We evaluate several classification approaches to automatically retrieve Tactics, Techniques and Procedures from unstructured text.
We present rcATT, a tool built on top of our findings and freely distributed to the security community to support cyber threat report automated analysis.
arXiv Detail & Related papers (2020-04-29T16:45:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.