Enriching Vulnerability Reports Through Automated and Augmented
Description Summarization
- URL: http://arxiv.org/abs/2210.01260v1
- Date: Mon, 3 Oct 2022 22:46:35 GMT
- Title: Enriching Vulnerability Reports Through Automated and Augmented
Description Summarization
- Authors: Hattan Althebeiti and David Mohaisen
- Abstract summary: Vulnerability descriptions play an important role in communicating the vulnerability information to security analysts.
This paper devises a pipeline to augment vulnerability description through third party reference (hyperlink) scrapping.
- Score: 6.3455238301221675
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Security incidents and data breaches are increasing rapidly, and only a
fraction of them is being reported. Public vulnerability databases, e.g.,
national vulnerability database (NVD) and common vulnerability and exposure
(CVE), have been leading the effort in documenting vulnerabilities and sharing
them to aid defenses. Both are known for many issues, including brief
vulnerability descriptions. Those descriptions play an important role in
communicating the vulnerability information to security analysts in order to
develop the appropriate countermeasure. Many resources provide additional
information about vulnerabilities, however, they are not utilized to boost
public repositories. In this paper, we devise a pipeline to augment
vulnerability description through third party reference (hyperlink) scrapping.
To normalize the description, we build a natural language summarization
pipeline utilizing a pretrained language model that is fine-tuned using labeled
instances and evaluate its performance against both human evaluation (golden
standard) and computational metrics, showing initial promising results in terms
of summary fluency, completeness, correctness, and understanding.
Related papers
- Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges [52.96987928118327]
We find that embedding models for retrieval, rerankers, and large language model (LLM) relevance judges are vulnerable to content injection attacks.
We identify two primary threats: (1) inserting unrelated or harmful content within passages that still appear deceptively "relevant", and (2) inserting entire queries or key query terms into passages to boost their perceived relevance.
Our study systematically examines the factors that influence an attack's success, such as the placement of injected content and the balance between relevant and non-relevant material.
arXiv Detail & Related papers (2025-01-30T18:02:15Z) - Learning Graph-based Patch Representations for Identifying and Assessing Silent Vulnerability Fixes [5.983725940750908]
Software projects are dependent on many third-party libraries, therefore high-risk vulnerabilities can propagate through the dependency chain to downstream projects.
Silent vulnerability fixes cause downstream software to be unaware of urgent security issues in a timely manner, posing a security risk to the software.
We propose GRAPE, a GRAph-based Patch rEpresentation that aims to provide a unified framework for getting vulnerability fix patches representation.
arXiv Detail & Related papers (2024-09-13T03:23:11Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts.
We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
Existing techniques face the emerging challenges of re-identification attack ability of Large Language Models.
This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z) - "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs)
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z) - VulZoo: A Comprehensive Vulnerability Intelligence Dataset [12.229092589037808]
VulZoo is a comprehensive vulnerability intelligence dataset that covers 17 popular vulnerability information sources.
We make VulZoo publicly available and maintain it with incremental updates to facilitate future research.
arXiv Detail & Related papers (2024-06-24T06:39:07Z) - Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities.
Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content.
We propose two novel defense mechanisms-boundary awareness and explicit reminder-to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z) - CVE representation to build attack positions graphs [0.39945675027960637]
In cybersecurity, CVEs (Common Vulnerabilities and Exposures) are publicly disclosed hardware or software vulnerabilities.
This article points out that these vulnerabilities should be described in greater detail to understand how they could be chained together in a complete attack scenario.
arXiv Detail & Related papers (2023-12-05T08:57:14Z) - Just-in-Time Detection of Silent Security Patches [7.840762542485285]
Security patches can be em silent, i.e., they do not always come with comprehensive advisories such as CVEs.
This lack of transparency leaves users oblivious to available security updates, providing ample opportunity for attackers to exploit unpatched vulnerabilities.
We propose to leverage large language models (LLMs) to augment patch information with generated code change explanations.
arXiv Detail & Related papers (2023-12-02T22:53:26Z) - Vulnerability Clustering and other Machine Learning Applications of
Semantic Vulnerability Embeddings [23.143031911859847]
We investigated different types of semantic vulnerability embeddings based on natural language processing (NLP) techniques.
We also evaluated their use as a foundation for machine learning applications that can support cyber-security researchers and analysts.
The particular applications we explored and briefly summarize are clustering, classification, and visualization.
arXiv Detail & Related papers (2023-08-23T21:39:48Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.