Related papers: NLP-Based Techniques for Cyber Threat Intelligence

NLP-Based Techniques for Cyber Threat Intelligence

URL: http://arxiv.org/abs/2311.08807v1
Date: Wed, 15 Nov 2023 09:23:33 GMT
Title: NLP-Based Techniques for Cyber Threat Intelligence
Authors: Marco Arazzi, Dincy R. Arikkat, Serena Nicolazzo, Antonino Nocera, Rafidha Rehiman K. A., Vinod P., Mauro Conti,
Abstract summary: Survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI.
Score: 13.958337678497163
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the digital era, threat actors employ sophisticated techniques for which, often, digital traces in the form of textual data are available. Cyber Threat Intelligence~(CTI) is related to all the solutions inherent to data collection, processing, and analysis useful to understand a threat actor's targets and attack behavior. Currently, CTI is assuming an always more crucial role in identifying and mitigating threats and enabling proactive defense strategies. In this context, NLP, an artificial intelligence branch, has emerged as a powerful tool for enhancing threat intelligence capabilities. This survey paper provides a comprehensive overview of NLP-based techniques applied in the context of threat intelligence. It begins by describing the foundational definitions and principles of CTI as a major tool for safeguarding digital assets. It then undertakes a thorough examination of NLP-based techniques for CTI data crawling from Web sources, CTI data analysis, Relation Extraction from cybersecurity data, CTI sharing and collaboration, and security threats of CTI. Finally, the challenges and limitations of NLP in threat intelligence are exhaustively examined, including data quality issues and ethical considerations. This survey draws a complete framework and serves as a valuable resource for security professionals and researchers seeking to understand the state-of-the-art NLP-based threat intelligence techniques and their potential impact on cybersecurity.

Related papers

False Alarms, Real Damage: Adversarial Attacks Using LLM-based Models on Text-based Cyber Threat Intelligence Systems [1.4932549821542682]
Cyber Threat Intelligence (CTI) has emerged as a vital complementary approach that operates in the early phases of the cyber threat lifecycle.<n>Due to the large volume of data, automation through Machine Learning (ML) and Natural Language Processing (NLP) models is essential for effective CTI extraction.<n>This study investigates vulnerabilities within various components of the entire CTI pipeline and their susceptibility to adversarial attacks.
arXiv Detail & Related papers (2025-07-05T19:00:27Z)
A Threat Intelligence Event Extraction Conceptual Model for Cyber Threat Intelligence Feeds [0.0]
The efficiency of Cyber Threat Intelligence (CTI) data collection has become paramount in ensuring robust cybersecurity.<n>Existing works encounter significant challenges in preprocessing large volumes of multilingual threat data, leading to inefficiencies in real-time threat analysis.<n>This paper presents a systematic review of current techniques aimed at enhancing CTI data collection efficiency.
arXiv Detail & Related papers (2025-06-04T04:09:01Z)
A Multidisciplinary Approach to Telegram Data Analysis [0.0]
This paper presents a multidisciplinary approach to analyzing data from Telegram for early warning information regarding cyber threats. We employ a combination of neural network architectures and traditional machine learning algorithms. We aim to enhance early warning systems for cyber threats, enabling more proactive responses to potential security breaches.
arXiv Detail & Related papers (2024-12-29T09:10:52Z)
CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats. Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction. We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z)
KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment [38.312774244521]
We propose a knowledge graph-based verifier for Cyber Threat Intelligence (CTI) quality assessment framework. Our approach introduces Large Language Models (LLMs) to automatically extract OSCTI key claims to be verified. To fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources.
arXiv Detail & Related papers (2024-08-15T11:32:46Z)
Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z)
ThreatKG: An AI-Powered System for Automated Open-Source Cyber Threat Intelligence Gathering and Management [65.0114141380651]
ThreatKG is an automated system for OSCTI gathering and management. It efficiently collects a large number of OSCTI reports from multiple sources. It uses specialized AI-based techniques to extract high-quality knowledge about various threat entities.
arXiv Detail & Related papers (2022-12-20T16:13:59Z)
Recognizing and Extracting Cybersecurtity-relevant Entities from Text [1.7499351967216343]
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks. CTI is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG)
arXiv Detail & Related papers (2022-08-02T18:44:06Z)
Towards Automated Classification of Attackers' TTPs by combining NLP with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research. Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z)
Generating Cyber Threat Intelligence to Discover Potential Security Threats Using Classification and Topic Modeling [6.0897744845912865]
Cyber Threat Intelligence (CTI) has been represented as one of the proactive and robust mechanisms. Our goal is to identify and explore relevant CTI from hacker forums by using different supervised and unsupervised learning techniques.
arXiv Detail & Related papers (2021-08-16T02:30:29Z)
A System for Automated Open-Source Threat Intelligence Gathering and Management [53.65687495231605]
SecurityKG is a system for automated OSCTI gathering and management. It uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors.
arXiv Detail & Related papers (2021-01-19T18:31:35Z)
A System for Efficiently Hunting for Cyber Threats in Computer Systems Using Threat Intelligence [78.23170229258162]
We build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI. ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, and (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors.
arXiv Detail & Related papers (2021-01-17T19:44:09Z)
Automated Retrieval of ATT&CK Tactics and Techniques for Cyber Threat Reports [5.789368942487406]
We evaluate several classification approaches to automatically retrieve Tactics, Techniques and Procedures from unstructured text. We present rcATT, a tool built on top of our findings and freely distributed to the security community to support cyber threat report automated analysis.
arXiv Detail & Related papers (2020-04-29T16:45:14Z)

This list is automatically generated from the titles and abstracts of the papers in this site.