Related papers: Enriching Vulnerability Reports Through Automated and Augmented Description Summarization

Enriching Vulnerability Reports Through Automated and Augmented Description Summarization

URL: http://arxiv.org/abs/2210.01260v1
Date: Mon, 3 Oct 2022 22:46:35 GMT
Title: Enriching Vulnerability Reports Through Automated and Augmented Description Summarization
Authors: Hattan Althebeiti and David Mohaisen
Abstract summary: Vulnerability descriptions play an important role in communicating the vulnerability information to security analysts. This paper devises a pipeline to augment vulnerability description through third party reference (hyperlink) scrapping.
Score: 6.3455238301221675
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Security incidents and data breaches are increasing rapidly, and only a fraction of them is being reported. Public vulnerability databases, e.g., national vulnerability database (NVD) and common vulnerability and exposure (CVE), have been leading the effort in documenting vulnerabilities and sharing them to aid defenses. Both are known for many issues, including brief vulnerability descriptions. Those descriptions play an important role in communicating the vulnerability information to security analysts in order to develop the appropriate countermeasure. Many resources provide additional information about vulnerabilities, however, they are not utilized to boost public repositories. In this paper, we devise a pipeline to augment vulnerability description through third party reference (hyperlink) scrapping. To normalize the description, we build a natural language summarization pipeline utilizing a pretrained language model that is fine-tuned using labeled instances and evaluate its performance against both human evaluation (golden standard) and computational metrics, showing initial promising results in terms of summary fluency, completeness, correctness, and understanding.

Related papers

Enhancing Vulnerability Reports with Automated and Augmented Description Summarization [12.179366049460555]
We introduce Zad, a system designed to enrich vulnerability descriptions by leveraging external resources. Zad consists of two pipelines: one collects and filters supplementary data using two encoders to build a detailed dataset, while the other fine-tunes a pre-trained model on this dataset to generate enriched descriptions. We evaluate Zad using standard summarization metrics and human assessments, demonstrating its effectiveness in enhancing vulnerability information.
arXiv Detail & Related papers (2025-04-29T13:08:27Z)
Advancing Vulnerability Classification with BERT: A Multi-Objective Learning Model [0.0]
This paper presents a novel Vulnerability Report that leverages the BERT (Bi Representations from Transformers) model to perform multi-label classification. The system is deployed via a REST API and a Streamlit UI, enabling real-time vulnerability analysis.
arXiv Detail & Related papers (2025-03-26T06:04:45Z)
REVAL: A Comprehension Evaluation on Reliability and Values of Large Vision-Language Models [59.445672459851274]
REVAL is a comprehensive benchmark designed to evaluate the textbfREliability and textbfVALue of Large Vision-Language Models. REVAL encompasses over 144K image-text Visual Question Answering (VQA) samples, structured into two primary sections: Reliability and Values. We evaluate 26 models, including mainstream open-source LVLMs and prominent closed-source models like GPT-4o and Gemini-1.5-Pro.
arXiv Detail & Related papers (2025-03-20T07:54:35Z)
Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges [52.96987928118327]
We find that embedding models for retrieval, rerankers, and large language model (LLM) relevance judges are vulnerable to content injection attacks. We identify two primary threats: (1) inserting unrelated or harmful content within passages that still appear deceptively "relevant", and (2) inserting entire queries or key query terms into passages to boost their perceived relevance. Our study systematically examines the factors that influence an attack's success, such as the placement of injected content and the balance between relevant and non-relevant material.
arXiv Detail & Related papers (2025-01-30T18:02:15Z)
Backdooring Vision-Language Models with Out-Of-Distribution Data [44.40928756056506]
Vision-Language Models (VLMs) generate detailed text descriptions from visual inputs. Despite their growing importance, the security of VLMs, particularly against backdoor attacks, is under explored. We introduce VLOOD (Backdooring Vision-Language Models with Out-of-Distribution Data), a novel approach with two key contributions.
arXiv Detail & Related papers (2024-10-02T06:21:00Z)
Learning Graph-based Patch Representations for Identifying and Assessing Silent Vulnerability Fixes [5.983725940750908]
Software projects are dependent on many third-party libraries, therefore high-risk vulnerabilities can propagate through the dependency chain to downstream projects. Silent vulnerability fixes cause downstream software to be unaware of urgent security issues in a timely manner, posing a security risk to the software. We propose GRAPE, a GRAph-based Patch rEpresentation that aims to provide a unified framework for getting vulnerability fix patches representation.
arXiv Detail & Related papers (2024-09-13T03:23:11Z)
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts. We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z)
Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy. Existing techniques face the emerging challenges of re-identification attack ability of Large Language Models. This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
"Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z)
VulZoo: A Comprehensive Vulnerability Intelligence Dataset [12.229092589037808]
VulZoo is a comprehensive vulnerability intelligence dataset that covers 17 popular vulnerability information sources. We make VulZoo publicly available and maintain it with incremental updates to facilitate future research.
arXiv Detail & Related papers (2024-06-24T06:39:07Z)
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities. Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content. We propose two novel defense mechanisms-boundary awareness and explicit reminder-to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z)
CVE representation to build attack positions graphs [0.39945675027960637]
In cybersecurity, CVEs (Common Vulnerabilities and Exposures) are publicly disclosed hardware or software vulnerabilities. This article points out that these vulnerabilities should be described in greater detail to understand how they could be chained together in a complete attack scenario.
arXiv Detail & Related papers (2023-12-05T08:57:14Z)
Just-in-Time Detection of Silent Security Patches [7.840762542485285]
Security patches can be em silent, i.e., they do not always come with comprehensive advisories such as CVEs. This lack of transparency leaves users oblivious to available security updates, providing ample opportunity for attackers to exploit unpatched vulnerabilities. We propose to leverage large language models (LLMs) to augment patch information with generated code change explanations.
arXiv Detail & Related papers (2023-12-02T22:53:26Z)
Vulnerability Clustering and other Machine Learning Applications of Semantic Vulnerability Embeddings [23.143031911859847]
We investigated different types of semantic vulnerability embeddings based on natural language processing (NLP) techniques. We also evaluated their use as a foundation for machine learning applications that can support cyber-security researchers and analysts. The particular applications we explored and briefly summarize are clustering, classification, and visualization.
arXiv Detail & Related papers (2023-08-23T21:39:48Z)
Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time. CROP employs a cost signal to identify unsafe interactions and use them to shape safety properties. We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP allows higher returns and lower violations over previous Safe DRL approaches.
arXiv Detail & Related papers (2023-02-13T21:19:36Z)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
Reinforcement Learning on Encrypted Data [58.39270571778521]
We present a preliminary, experimental study of how a DQN agent trained on encrypted states performs in environments with discrete and continuous state spaces. Our results highlight that the agent is still capable of learning in small state spaces even in presence of non-deterministic encryption, but performance collapses in more complex environments.
arXiv Detail & Related papers (2021-09-16T21:59:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.