Common Vulnerability Scoring System Prediction based on Open Source Intelligence Information Sources
- URL: http://arxiv.org/abs/2210.02143v1
- Date: Wed, 5 Oct 2022 10:54:15 GMT
- Title: Common Vulnerability Scoring System Prediction based on Open Source Intelligence Information Sources
- Authors: Philipp Kuehn, David N. Relke, Christian Reuter
- Abstract summary: This work provides a classification of the National Vulnerability Database's references based on the suitability and crawlability of their texts.
While we found that the overall influence of the additional texts is negligible, our Deep Learning prediction models outperformed the state-of-the-art.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The number of newly published vulnerabilities is constantly increasing. Until
now, the information available when a new vulnerability is published has been
manually assessed by experts, who assign a Common Vulnerability Scoring System
(CVSS) vector and score. This assessment is time-consuming and requires
expertise. Various works already attempt to predict CVSS vectors or scores
using machine learning on the textual descriptions of vulnerabilities to enable
faster assessment. However, previous works use only the texts available in
databases such as the National Vulnerability Database. In this work, the
publicly available web pages referenced in the National Vulnerability Database
are analyzed and made available as text sources through web scraping. A Deep
Learning based method for predicting the CVSS vector is implemented and
evaluated. The present work provides a classification of the National
Vulnerability Database's references based on the suitability and crawlability
of their texts. While we found that the overall influence of the additional
texts is negligible, our Deep Learning prediction models outperformed the
state-of-the-art.
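As a rough illustration of the scraping step the abstract describes, the sketch below selects reference URLs from an NVD record and strips a fetched page down to plain text. The NVD API 2.0 field layout (`cve.references` with `url`/`tags`) and the tag filter are assumptions for illustration, not the paper's actual pipeline:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text while skipping <script>/<style> content."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._stack = []   # open-tag stack, to know which element we are inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        if not (self._stack and self._stack[-1] in self.SKIP):
            text = data.strip()
            if text:
                self.chunks.append(text)

def reference_urls(nvd_record, tags=("Vendor Advisory", "Third Party Advisory")):
    """Select reference URLs from an NVD API 2.0 CVE record by tag."""
    refs = nvd_record["cve"]["references"]
    return [r["url"] for r in refs if set(r.get("tags", [])) & set(tags)]

def page_text(html):
    """Reduce a fetched reference page to plain text for a downstream model."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

A real crawler would add fetching, encoding detection, and per-site cleanup; the point here is only the record-to-text shape of the pipeline.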
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- DFEPT: Data Flow Embedding for Enhancing Pre-Trained Model Based Vulnerability Detection [7.802093464108404]
We propose a data flow embedding technique to enhance the performance of pre-trained models in vulnerability detection tasks.
Specifically, we parse data flow graphs (DFGs) from function-level source code and use variable data types as the node features of the DFG.
Our research shows that DFEPT can provide effective vulnerability semantic information to pre-trained models, achieving an accuracy of 64.97% on the Devign dataset and an F1-Score of 47.9% on the Reveal dataset.
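A toy version of the def-use idea behind such data flow graphs can be sketched with Python's `ast` module (3.9+ for `ast.unparse`). This is a loose illustration of typed DFG nodes, not DFEPT's actual parser, which targets a different language and graph construction:

```python
import ast

def dataflow_edges(source):
    """Crude def-use data flow extraction for a Python snippet:
    an edge (def_line, use_line, name) links an assignment to a later read.
    Node "type" features come from annotations when present."""
    tree = ast.parse(source)
    defs, types, edges = {}, {}, []
    for node in ast.walk(tree):
        if isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            defs[node.target.id] = node.lineno            # annotated definition
            types[node.target.id] = ast.unparse(node.annotation)
        elif isinstance(node, ast.Assign):
            for t in node.targets:
                if isinstance(t, ast.Name):
                    defs[t.id] = t.lineno                 # plain definition
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id in defs and node.lineno > defs[node.id]:
                edges.append((defs[node.id], node.lineno, node.id))
    return edges, types
```

For `def f():\n    x: int = 1\n    y = x + 2\n    return y`, this yields edges for `x` flowing into line 3 and `y` into line 4, with `x` typed as `int`; a production DFG would also handle reassignment and control flow.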
arXiv Detail & Related papers (2024-10-24T07:05:07Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, inclusive of both automatically generated training data, as well as the human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- Vulnerability Clustering and other Machine Learning Applications of Semantic Vulnerability Embeddings [23.143031911859847]
We investigated different types of semantic vulnerability embeddings based on natural language processing (NLP) techniques.
We also evaluated their use as a foundation for machine learning applications that can support cyber-security researchers and analysts.
The particular applications we explored and briefly summarize are clustering, classification, and visualization.
arXiv Detail & Related papers (2023-08-23T21:39:48Z)
- Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model to explore contrastive learning in the domain of misinformation identification.
Our model shows superior performance in detecting non-matched image-text pairs when the training data is insufficient.
arXiv Detail & Related papers (2023-04-19T02:53:59Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases, insignificant changes in the input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Enriching Vulnerability Reports Through Automated and Augmented Description Summarization [6.3455238301221675]
Vulnerability descriptions play an important role in communicating the vulnerability information to security analysts.
This paper devises a pipeline to augment vulnerability descriptions by scraping third-party references (hyperlinks).
arXiv Detail & Related papers (2022-10-03T22:46:35Z)
- Learning-based Hybrid Local Search for the Hard-label Textual Attack [53.92227690452377]
We consider a rarely investigated but more rigorous setting, namely the hard-label attack, in which the attacker can only access the prediction label.
We propose a novel hard-label attack, the Learning-based Hybrid Local Search (LHLS) algorithm.
Our LHLS significantly outperforms existing hard-label attacks in both attack performance and adversarial example quality.
arXiv Detail & Related papers (2022-01-20T14:16:07Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- CVSS-BERT: Explainable Natural Language Processing to Determine the Severity of a Computer Security Vulnerability from its Description [0.0]
Cybersecurity experts provide an analysis of the severity of a vulnerability using the Common Vulnerability Scoring System (CVSS).
We propose to leverage recent advances in the field of Natural Language Processing (NLP) to determine the CVSS vector and the associated severity score of a vulnerability in an explainable manner.
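Once the vector is predicted, the severity score itself is a deterministic function of it. A minimal sketch of the CVSS v3.1 base-score equations (metric weights and rounding rule taken from the FIRST.org specification; the papers above may also target other CVSS versions):

```python
import math

# CVSS v3.1 metric weights, per the FIRST.org specification
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}       # Attack Vector
AC = {"L": 0.77, "H": 0.44}                             # Attack Complexity
UI = {"N": 0.85, "R": 0.62}                             # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}                  # Confidentiality/Integrity/Availability
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}        # Privileges Required, scope unchanged
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}           # Privileges Required, scope changed

def roundup(x):
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    i = int(round(x * 100000))
    return i / 100000.0 if i % 10000 == 0 else (math.floor(i / 10000) + 1) / 10.0

def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    m = dict(part.split(":") for part in vector.split("/")[1:])  # skip "CVSS:3.1"
    changed = m["S"] == "C"
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    impact = (7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15) if changed else 6.42 * iss
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total, 10)) if changed else roundup(min(total, 10))
```

For example, `AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H` evaluates to 9.8, and the scope-changed variant `S:C` to 10.0.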
arXiv Detail & Related papers (2021-11-16T14:31:09Z)
- Examining Redundancy in the Context of Safe Machine Learning [0.0]
This paper describes a set of experiments with neural network classifiers on the MNIST database of digits.
We report on a set of measurements using the MNIST database which ultimately serve to underline the expected difficulties in using NN classifiers in safe and dependable systems.
arXiv Detail & Related papers (2020-07-03T18:23:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The quality of this list (including all information) is not guaranteed, and this site is not responsible for any consequences of its use.