Cybersecurity Threat Hunting and Vulnerability Analysis Using a Neo4j Graph Database of Open Source Intelligence
- URL: http://arxiv.org/abs/2301.12013v2
- Date: Mon, 07 Oct 2024 21:15:40 GMT
- Title: Cybersecurity Threat Hunting and Vulnerability Analysis Using a Neo4j Graph Database of Open Source Intelligence
- Authors: Elijah Pelofske, Lorie M. Liebrock, Vincent Urias,
- Abstract summary: We present a system which constructs a Neo4j graph database formed by shared connections between open source intelligence text and other information.
These connections are comprised of possible indicators of compromise (e.g., IP addresses, domains, hashes, email addresses, phone numbers) and information on known exploits and techniques.
We show three specific examples of interesting connections found in the graph database; the connections to a known exploited CVE, a known malicious IP address, and a malware hash signature.
- Score: 0.8192907805418583
- License:
- Abstract: Open source intelligence is a powerful tool for cybersecurity analysts to gather information both for analysis of discovered vulnerabilities and for detecting novel cybersecurity threats and exploits. However the scale of information that is relevant for information security on the internet is always increasing, and is intractable for analysts to parse comprehensively. Therefore methods of condensing the available open source intelligence, and automatically developing connections between disparate sources of information, is incredibly valuable. In this research, we present a system which constructs a Neo4j graph database formed by shared connections between open source intelligence text including blogs, cybersecurity bulletins, news sites, antivirus scans, social media posts (e.g., Reddit and Twitter), and threat reports. These connections are comprised of possible indicators of compromise (e.g., IP addresses, domains, hashes, email addresses, phone numbers), information on known exploits and techniques (e.g., CVEs and MITRE ATT&CK Technique ID's), and potential sources of information on cybersecurity exploits such as twitter usernames. The construction of the database of potential IoCs is detailed, including the addition of machine learning and metadata which can be used for filtering of the data for a specific domain (for example a specific natural language) when needed. Examples of utilizing the graph database for querying connections between known malicious IoCs and open source intelligence documents, including threat reports, are shown. We show three specific examples of interesting connections found in the graph database; the connections to a known exploited CVE, a known malicious IP address, and a malware hash signature.
Related papers
- Enhancing Supply Chain Visibility with Generative AI: An Exploratory Case Study on Relationship Prediction in Knowledge Graphs [52.79646338275159]
Relationship prediction aims to increase the visibility of supply chains using data-driven techniques.
Existing methods have been successful for predicting relationships but struggle to extract the context in which these relationships are embedded.
Lack of context prevents practitioners from distinguishing transactional relations from established supply chain relations.
arXiv Detail & Related papers (2024-12-04T15:19:01Z) - A Multi-Functional Web Tool for Comprehensive Threat Detection Through IP Address Analysis [0.412484724941528]
This work introduces a comprehensive web tool for advanced IP address characterisation.
Our tool offers a wide range of features, including geolocation, blocklist check, VPN detection, proxy detection, bot detection, Tor detection, port scan, and accurate domain statistics.
Our tool supports domain names and IPv4 addresses, making it a multi-functional and powerful IP analyser tool for threat intelligence.
arXiv Detail & Related papers (2024-12-04T04:29:12Z) - CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment [38.312774244521]
We propose a knowledge graph-based verifier for Cyber Threat Intelligence (CTI) quality assessment framework.
Our approach introduces Large Language Models (LLMs) to automatically extract OSCTI key claims to be verified.
To fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources.
arXiv Detail & Related papers (2024-08-15T11:32:46Z) - "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs)
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z) - Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society.
Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities.
With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z) - Exploring the Limits of Transfer Learning with Unified Model in the
Cybersecurity Domain [17.225973170682604]
We introduce a generative multi-task model, Unified Text-to-Text Cybersecurity (UTS)
UTS is trained on malware reports, phishing site URLs, programming code constructs, social media data, blogs, news articles, and public forum posts.
We show UTS improves the performance of some cybersecurity datasets.
arXiv Detail & Related papers (2023-02-20T22:21:26Z) - ThreatKG: An AI-Powered System for Automated Open-Source Cyber Threat Intelligence Gathering and Management [65.0114141380651]
ThreatKG is an automated system for OSCTI gathering and management.
It efficiently collects a large number of OSCTI reports from multiple sources.
It uses specialized AI-based techniques to extract high-quality knowledge about various threat entities.
arXiv Detail & Related papers (2022-12-20T16:13:59Z) - Recognizing and Extracting Cybersecurtity-relevant Entities from Text [1.7499351967216343]
Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks.
CTI is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG)
arXiv Detail & Related papers (2022-08-02T18:44:06Z) - A System for Automated Open-Source Threat Intelligence Gathering and
Management [53.65687495231605]
SecurityKG is a system for automated OSCTI gathering and management.
It uses a combination of AI and NLP techniques to extract high-fidelity knowledge about threat behaviors.
arXiv Detail & Related papers (2021-01-19T18:31:35Z) - MALOnt: An Ontology for Malware Threat Intelligence [19.57441168490977]
Malware threat intelligence uncovers deep information about malware, threat actors, and their tactics.
MALOnt allows structured extraction of information and knowledge graph generation.
arXiv Detail & Related papers (2020-06-20T00:25:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.