Related papers: ThreatZoom: CVE2CWE using Hierarchical Neural Network

ThreatZoom: CVE2CWE using Hierarchical Neural Network

URL: http://arxiv.org/abs/2009.11501v1
Date: Thu, 24 Sep 2020 06:04:56 GMT
Title: ThreatZoom: CVE2CWE using Hierarchical Neural Network
Authors: Ehsan Aghaei, Waseem Shadid, Ehab Al-Shaer
Abstract summary: One or more CVEs are grouped into the Common Weakness Exposureion (CWE) classes. Thousands of critical and new CVEs remain unclassified, yet they are unpatchable. This paper presents the first automatic tool to classify CVEs to CWEs.
Score: 4.254099382808598
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Common Vulnerabilities and Exposures (CVE) represent standard means for sharing publicly known information security vulnerabilities. One or more CVEs are grouped into the Common Weakness Enumeration (CWE) classes for the purpose of understanding the software or configuration flaws and potential impacts enabled by these vulnerabilities and identifying means to detect or prevent exploitation. As the CVE-to-CWE classification is mostly performed manually by domain experts, thousands of critical and new CVEs remain unclassified, yet they are unpatchable. This significantly limits the utility of CVEs and slows down proactive threat mitigation. This paper presents the first automatic tool to classify CVEs to CWEs. ThreatZoom uses a novel learning algorithm that employs an adaptive hierarchical neural network which adjusts its weights based on text analytic scores and classification errors. It automatically estimates the CWE classes corresponding to a CVE instance using both statistical and semantic features extracted from the description of a CVE. This tool is rigorously tested by various datasets provided by MITRE and the National Vulnerability Database (NVD). The accuracy of classifying CVE instances to their correct CWE classes are 92% (fine-grain) and 94% (coarse-grain) for NVD dataset, and 75% (fine-grain) and 90% (coarse-grain) for MITRE dataset, despite the small corpus.

Related papers

CPE-Identifier: Automated CPE identification and CVE summaries annotation with Deep Learning and NLP [0.28281736775010774]
We propose the CPE-Identifier system, an automated CPE annotating and extracting system, from the CVE summaries. The system can be used as a tool to identify CPE entities from new CVE text inputs. We also apply Natural Language Processing (NLP) Named Entity Recognition (NER) to identify new technical jargons in the text.
arXiv Detail & Related papers (2024-05-22T12:05:17Z)
Unveiling Hidden Links Between Unseen Security Entities [3.7138962865789353]
VulnScopper is an innovative approach that utilizes multi-modal representation learning, combining Knowledge Graphs (KG) and Natural Processing (NLP) We evaluate VulnScopper on two major security datasets, the National Vulnerability Database (NVD) and the Red Hat CVE database. Our results show that VulnScopper outperforms existing methods, achieving up to 78% Hits@10 accuracy in linking CVEs to Common Vulnerabilities and Exposures (CWEs), and Common Platform Languageions (CPEs)
arXiv Detail & Related papers (2024-03-04T13:14:39Z)
Automated CVE Analysis for Threat Prioritization and Impact Prediction [4.540236408836132]
We introduce our novel predictive model and tool (called CVEDrill) which revolutionizes CVE analysis and threat prioritization. CVEDrill accurately estimates the Common Vulnerability Scoring System (CVSS) vector for precise threat mitigation and priority ranking. It seamlessly automates the classification of CVEs into the appropriate Common Weaknession (CWE) hierarchy classes.
arXiv Detail & Related papers (2023-09-06T14:34:03Z)
CVE-driven Attack Technique Prediction with Semantic Information Extraction and a Domain-specific Language Model [2.1756081703276]
The paper introduces the TTPpredictor tool, which uses innovative techniques to analyze CVE descriptions and infer plausible TTP attacks resulting from CVE exploitation. TTPpredictor overcomes challenges posed by limited labeled data and semantic disparities between CVE and TTP descriptions. The paper presents an empirical assessment, demonstrating TTPpredictor's effectiveness with accuracy rates of approximately 98% and F1-scores ranging from 95% to 98% in precise CVE classification to ATT&CK techniques.
arXiv Detail & Related papers (2023-09-06T06:53:45Z)
Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data. Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories. One effective way for GCD is applying self-supervised learning to learn discriminate representation for unlabeled data. We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
Upcycling Models under Domain and Category Shift [95.22147885947732]
We introduce an innovative global and local clustering learning technique (GLC) We design a novel, adaptive one-vs-all global clustering algorithm to achieve the distinction across different target classes. Remarkably, in the most challenging open-partial-set DA scenario, GLC outperforms UMAD by 14.8% on the VisDA benchmark.
arXiv Detail & Related papers (2023-03-13T13:44:04Z)
Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations. DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals. We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
Cycle Self-Training for Domain Adaptation [85.14659717421533]
Cycle Self-Training (CST) is a principled self-training algorithm that enforces pseudo-labels to generalize across domains. CST recovers target ground truth, while both invariant feature learning and vanilla self-training fail. Empirical results indicate that CST significantly improves over prior state-of-the-arts in standard UDA benchmarks.
arXiv Detail & Related papers (2021-03-05T10:04:25Z)
V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities [7.906207218788341]
We present a novel Transformer-based learning framework (V2W-BERT) in this paper. By using ideas from natural language processing, link prediction and transfer learning, our method outperforms previous approaches. We achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy in temporally partitioned data.
arXiv Detail & Related papers (2021-02-23T05:16:57Z)
Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) is to learn classification models that make predictions for unlabeled data on a target domain. We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one. Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.