OntoEnricher: A Deep Learning Approach for Ontology Enrichment from
Unstructured Text
- URL: http://arxiv.org/abs/2102.04081v1
- Date: Mon, 8 Feb 2021 09:43:05 GMT
- Title: OntoEnricher: A Deep Learning Approach for Ontology Enrichment from
Unstructured Text
- Authors: Lalit Mohan Sanagavarapu, Vivek Iyer and Y Raghu Reddy
- Abstract summary: Existing information on vulnerabilities, controls, and advisories available on the web provides an opportunity to represent knowledge and perform analytics to mitigate some of the concerns.
This necessitates dynamic and automated enrichment of information security.
Existing ontology enrichment algorithms based on natural processing and ML models have issues with the contextual extraction of concepts in words, phrases and sentences.
- Score: 2.707154152696381
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Information Security in the cyber world is a major cause for concern, with
significant increase in the number of attack surfaces. Existing information on
vulnerabilities, attacks, controls, and advisories available on the web
provides an opportunity to represent knowledge and perform security analytics
to mitigate some of the concerns. Representing security knowledge in the form
of ontology facilitates anomaly detection, threat intelligence, reasoning and
relevance attribution of attacks, and many more. This necessitates dynamic and
automated enrichment of information security ontologies. However, existing
ontology enrichment algorithms based on natural language processing and ML
models have issues with the contextual extraction of concepts in words, phrases
and sentences. This motivates the need for sequential Deep Learning
architectures that traverse through dependency paths in text and extract
embedded vulnerabilities, threats, controls, products and other security
related concepts and instances from learned path representations. In the
proposed approach, Bidirectional LSTMs trained on a large DBpedia dataset and
Wikipedia corpus of 2.8 GB along with Universal Sentence Encoder was deployed
to enrich ISO 27001 based information security ontology. The approach yielded a
test accuracy of over 80\% when tested with knocked out concepts from ontology
and web page instances to validate the robustness.
Related papers
- CTINEXUS: Leveraging Optimized LLM In-Context Learning for Constructing Cybersecurity Knowledge Graphs Under Data Scarcity [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.
Current CTI extraction methods lack flexibility and generalizability, often resulting in inaccurate and incomplete knowledge extraction.
We propose CTINexus, a novel framework leveraging optimized in-context learning (ICL) of large language models.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks.
Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task.
FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability.
FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
arXiv Detail & Related papers (2024-04-15T09:10:52Z) - Dynamic Neural Control Flow Execution: An Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection [4.629503670145618]
Software vulnerabilities are a challenge in cybersecurity.
DeepEXE is an agent-based implicit neural network that mimics the execution path of a program.
We show that DeepEXE is an accurate and efficient method and outperforms the state-of-the-art vulnerability detection methods.
arXiv Detail & Related papers (2024-04-03T22:07:50Z) - HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for
Root Cause Analysis with GPT-assisted Mitigation Suggestion [3.847218857469107]
We presentHW-V2W-Map Framework, which is a Machine Learning (ML) framework focusing on hardware vulnerabilities and Internet of Things (IoT) security.
The architecture that we have proposed incorporates an Ontology-driven Storytelling framework, which automates the process of updating the Ontology.
Our proposed framework utilized Generative Pre-trained Transformer (GPT) Large Language Models (LLMs) to provide mitigation suggestions.
arXiv Detail & Related papers (2023-12-21T02:14:41Z) - DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain
Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z) - Towards Automated Classification of Attackers' TTPs by combining NLP
with ML Techniques [77.34726150561087]
We evaluate and compare different Natural Language Processing (NLP) and machine learning techniques used for security information extraction in research.
Based on our investigations we propose a data processing pipeline that automatically classifies unstructured text according to attackers' tactics and techniques.
arXiv Detail & Related papers (2022-07-18T09:59:21Z) - Multi-features based Semantic Augmentation Networks for Named Entity
Recognition in Threat Intelligence [7.321994923276344]
We propose a semantic augmentation method which incorporates different linguistic features to enrich the representation of input tokens.
In particular, we encode and aggregate the constituent feature, morphological feature and part of speech feature for each input token to improve the robustness of the method.
We have conducted experiments on the cybersecurity datasets DNRTI and MalwareTextDB, and the results demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-07-01T06:55:12Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - A Deep Learning Approach for Ontology Enrichment from Unstructured Text [2.932750332087746]
Existing information vulnerabilities on attacks, controls, and advisories available on the web provide an opportunity to represent and perform security analytics.
Ontology enrichment algorithms based on natural language processing and ML models have issues with contextual extraction of concepts in words, phrases, and sentences.
Bidirectional LSTMs trained on a large DB dataset and Wikipedia corpus of 2.8 GB along with Universal Sentence is deployed to enrich ISO-based information security.
arXiv Detail & Related papers (2021-12-16T01:32:21Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.