Knowledge mining of unstructured information: application to
cyber-domain
- URL: http://arxiv.org/abs/2109.03848v2
- Date: Fri, 10 Sep 2021 06:38:16 GMT
- Title: Knowledge mining of unstructured information: application to
cyber-domain
- Authors: Tuomas Takko, Kunal Bhattacharya, Martti Lehto, Pertti Jalasvirta,
Aapo Cederberg, Kimmo Kaski
- Abstract summary: We present and implement a novel knowledge graph and knowledge mining framework for extracting relevant information from free-form text about incidents in the cyber domain.
Our framework includes a machine learning based pipeline as well as crawling methods for generating graphs of entities, attackers and the related information.
We test our framework on publicly available cyber incident datasets to evaluate the accuracy of our knowledge mining methods as well as the usefulness of the framework to cyber analysts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cyber intelligence is widely and abundantly available in numerous open online
sources with reports on vulnerabilities and incidents. This constant stream of
noisy information requires new tools and techniques if it is to be used for the
benefit of analysts and investigators in various organizations. In this paper
we present and implement a novel knowledge graph and knowledge mining framework
for extracting relevant information from free-form text about incidents in the
cyber domain. Our framework includes a machine learning based pipeline as well
as crawling methods for generating graphs of entities, attackers and the
related information with our non-technical cyber ontology. We test our
framework on publicly available cyber incident datasets to evaluate the
accuracy of our knowledge mining methods as well as the usefulness of the
framework to cyber analysts. Our results show that, by analyzing the
knowledge graph constructed using the novel framework, an analyst can infer
additional information from the current cyber landscape in terms of risk to
various entities and the propagation of risk between industries and countries.
Expanding the framework to accommodate more technical and operational level
information can increase the accuracy and explainability of trends and risk in
the knowledge graph.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z)
- Customized Information and Domain-centric Knowledge Graph Construction with Large Language Models [0.0]
We propose a novel approach based on knowledge graphs to provide timely access to structured information.
Our framework encompasses a text mining process, which includes information retrieval, keyphrase extraction, semantic network creation, and topic map visualization.
We apply our methodology to the domain of automotive electrical systems to demonstrate the approach, which is scalable.
arXiv Detail & Related papers (2024-09-30T07:08:28Z)
- Private Knowledge Sharing in Distributed Learning: A Survey [50.51431815732716]
The rise of Artificial Intelligence has revolutionized numerous industries and transformed the way society operates.
It is crucial to utilize information in learning processes that are either distributed or owned by different entities.
Modern data-driven services have been developed to integrate distributed knowledge entities into their outcomes.
arXiv Detail & Related papers (2024-02-08T07:18:23Z)
- Privacy-Preserving Graph Machine Learning from Data to Computation: A Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z)
- Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database [3.0724051098062097]
We present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD).
Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, rules, and knowledge graph embeddings.
We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate the performance.
arXiv Detail & Related papers (2023-04-30T04:23:40Z)
- Graph Mining for Cybersecurity: A Survey [61.505995908021525]
The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society.
Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities.
With the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance.
arXiv Detail & Related papers (2023-04-02T08:43:03Z)
- An energy-based model for neuro-symbolic reasoning on knowledge graphs [0.0]
We propose an energy-based graph embedding algorithm to characterize industrial automation systems.
By combining knowledge from multiple domains, the learned model is capable of making context-aware predictions.
The presented model is mappable to a biologically-inspired neural architecture, serving as a first bridge between graph embedding methods and neuromorphic computing.
arXiv Detail & Related papers (2021-10-04T18:02:36Z)
- A Crawler Architecture for Harvesting the Clear, Social, and Dark Web for IoT-Related Cyber-Threat Intelligence [1.1661238776379117]
The clear, social, and dark web have lately been identified as rich sources of valuable cyber-security information.
We present a novel crawling architecture for transparently harvesting data from security websites in the clear web, security forums in the social web, and hacker forums/marketplaces in the dark web.
arXiv Detail & Related papers (2021-09-14T19:26:08Z)
- Machine learning on knowledge graphs for context-aware security monitoring [0.0]
We discuss the application of machine learning on knowledge graphs for intrusion detection.
We experimentally evaluate a link-prediction method for scoring anomalous activity in industrial systems.
The proposed method is shown to produce intuitively well-calibrated and interpretable alerts in a diverse range of scenarios.
arXiv Detail & Related papers (2021-05-18T18:00:19Z)
- Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data.
We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
- Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector.
We discuss the techniques used for the capture, preparation and transformation of the data, as well as the data mining and evaluation methods.
As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)