CVSS-BERT: Explainable Natural Language Processing to Determine the
Severity of a Computer Security Vulnerability from its Description
- URL: http://arxiv.org/abs/2111.08510v1
- Date: Tue, 16 Nov 2021 14:31:09 GMT
- Authors: Mustafizur Shahid (IP Paris), Hervé Debar
- Abstract summary: Cybersecurity experts provide an analysis of the severity of a vulnerability using the Common Vulnerability Scoring System (CVSS).
We propose to leverage recent advances in the field of Natural Language Processing (NLP) to determine the CVSS vector and the associated severity score of a vulnerability in an explainable manner.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When a new computer security vulnerability is publicly disclosed, only a
textual description of it is available. Cybersecurity experts later provide an
analysis of the severity of the vulnerability using the Common Vulnerability
Scoring System (CVSS). Specifically, the different characteristics of the
vulnerability are summarized into a vector (consisting of a set of metrics),
from which a severity score is computed. However, because of the high number of
vulnerabilities disclosed every day, this process requires a lot of manpower, and
several days may pass before a vulnerability is analyzed. We propose to
leverage recent advances in the field of Natural Language Processing (NLP) to
determine the CVSS vector and the associated severity score of a vulnerability
from its textual description in an explainable manner. To this end, we
trained multiple BERT classifiers, one for each metric composing the CVSS
vector. Experimental results show that our trained classifiers are able to
determine the value of the metrics of the CVSS vector with high accuracy. The
severity score computed from the predicted CVSS vector is also very close to
the real severity score attributed by a human expert. For explainability
purposes, a gradient-based input saliency method was used to determine the most
relevant input words for a given prediction made by our classifiers. Often, the
top relevant words include terms in agreement with the rationales of a human
cybersecurity expert, making the explanation comprehensible for end-users.
Related papers
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations.
First, existing methods often use a coarse-grained taxonomy of unsafe topics and over-represent some fine-grained topics.
Second, the linguistic characteristics and formatting of prompts, such as different languages and dialects, are often overlooked and only implicitly considered in many evaluations.
Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z) - The Vulnerability Is in the Details: Locating Fine-grained Information
of Vulnerable Code Identified by Graph-based Detectors [39.01486277170386]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets.
It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z) - ASSERT: Automated Safety Scenario Red Teaming for Evaluating the
Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z) - Gotta Catch 'em All: Aggregating CVSS Scores [1.5839621757142595]
We propose a CVSS aggregation algorithm that integrates information about the functionality of the SUT, exploitation difficulty, existence of exploits, and the context where the SUT operates.
The aggregation algorithm was applied to OpenPLC V3, showing that it is capable of filtering out vulnerabilities that cannot be exploited in the real conditions of deployment.
arXiv Detail & Related papers (2023-10-03T14:04:40Z) - Automated CVE Analysis for Threat Prioritization and Impact Prediction [4.540236408836132]
We introduce our novel predictive model and tool (called CVEDrill) which revolutionizes CVE analysis and threat prioritization.
CVEDrill accurately estimates the Common Vulnerability Scoring System (CVSS) vector for precise threat mitigation and priority ranking.
It seamlessly automates the classification of CVEs into the appropriate Common Weakness Enumeration (CWE) hierarchy classes.
arXiv Detail & Related papers (2023-09-06T14:34:03Z) - Vulnerability Clustering and other Machine Learning Applications of
Semantic Vulnerability Embeddings [23.143031911859847]
We investigated different types of semantic vulnerability embeddings based on natural language processing (NLP) techniques.
We also evaluated their use as a foundation for machine learning applications that can support cyber-security researchers and analysts.
The particular applications we explored and briefly summarize are clustering, classification, and visualization.
arXiv Detail & Related papers (2023-08-23T21:39:48Z) - Common Vulnerability Scoring System Prediction based on Open Source
Intelligence Information Sources [0.0]
This work provides a classification of the National Vulnerability Database's reference texts based on the suitability and crawlability of their texts.
While we found that the overall influence of the additional texts is negligible, our Deep Learning prediction models outperformed the state of the art.
arXiv Detail & Related papers (2022-10-05T10:54:15Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open-source so that future work can use it for comparison.
arXiv Detail & Related papers (2021-07-01T08:58:16Z) - Detection of Adversarial Supports in Few-shot Classifiers Using Feature
Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We make use of feature-preserving autoencoder filtering and the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and, to the best of our knowledge, the first to explore detection for few-shot classifiers.
arXiv Detail & Related papers (2020-12-09T14:13:41Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.