SVA-ICL: Improving LLM-based Software Vulnerability Assessment via In-Context Learning and Information Fusion
- URL: http://arxiv.org/abs/2505.10008v2
- Date: Wed, 28 May 2025 08:37:29 GMT
- Title: SVA-ICL: Improving LLM-based Software Vulnerability Assessment via In-Context Learning and Information Fusion
- Authors: Chaoyang Gao, Xiang Chen, Guangbei Zhang,
- Abstract summary: We introduce a novel approach SVA-ICL, which leverages in-context learning (ICL) to enhance large language models (LLMs) performance.<n>We evaluate the effectiveness of SVA-ICL using a large-scale dataset comprising 12,071 C/C++ vulnerabilities.
- Score: 5.06185582943982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context: Software vulnerability assessment (SVA) is critical for identifying, evaluating, and prioritizing security weaknesses in software applications. Objective: Despite the increasing application of large language models (LLMs) in various software engineering tasks, their effectiveness in SVA remains underexplored. Method: To address this gap, we introduce a novel approach SVA-ICL, which leverages in-context learning (ICL) to enhance LLM performance. Our approach involves the selection of high-quality demonstrations for ICL through information fusion, incorporating both source code and vulnerability descriptions. For source code, we consider semantic, lexical, and syntactic similarities, while for vulnerability descriptions, we focus on textual similarity. Based on the selected demonstrations, we construct context prompts and consider DeepSeek-V2 as the LLM for SVA-ICL. Results: We evaluate the effectiveness of SVA-ICL using a large-scale dataset comprising 12,071 C/C++ vulnerabilities. Experimental results demonstrate that SVA-ICL outperforms state-of-the-art SVA baselines in terms of Accuracy, F1-score, and MCC measures. Furthermore, ablation studies highlight the significance of component customization in SVA-ICL, such as the number of demonstrations, the demonstration ordering strategy, and the optimal fusion ratio of different modalities. Conclusion: Our findings suggest that leveraging ICL with information fusion can effectively improve the effectiveness of LLM-based SVA, warranting further research in this direction.
Related papers
- ReVul-CoT: Towards Effective Software Vulnerability Assessment with Retrieval-Augmented Generation and Chain-of-Thought Prompting [9.735224996021591]
We propose a novel framework that integrates Retrieval-Augmented Generation (RAG) with Chain-of-Thought (COT) prompting.<n>In ReVul-CoT, the RAG module dynamically retrieves contextually relevant information from a constructed local knowledge base.<n>Building on DeepSeek-V3.1, CoT prompting guides the LLM to perform step-by-step reasoning over exploitability, impact scope, and related factors.
arXiv Detail & Related papers (2025-11-21T08:01:49Z) - VULPO: Context-Aware Vulnerability Detection via On-Policy LLM Optimization [2.6678231901651723]
This paper introduces Vulnerability-Adaptive Policy Optimization (VULPO), an on-policy LLM reinforcement learning framework for context-aware vulnerability detection.<n>To support training and evaluation, we first construct ContextVul, a new dataset that augments high-quality function-level samples with lightweight method to extract repository-level context information.<n>To address the asymmetric difficulty of different vulnerability cases and mitigate reward hacking, VULPO incorporates label-level and sample-level difficulty-adaptive reward scaling.
arXiv Detail & Related papers (2025-11-14T21:57:48Z) - Real-VulLLM: An LLM Based Assessment Framework in the Wild [0.7408058999454915]
Large Language Models (LLMs) have demonstrated exceptional progress in software engineering.<n>Their capability for vulnerability detection in the wild scenario and its corresponding reasoning remains underexplored.<n>Our contributions are (i)varied prompt designs for vulnerability detection and its corresponding reasoning in the wild.
arXiv Detail & Related papers (2025-10-05T06:34:30Z) - When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering using Small VLMs [4.296395082987112]
Large Vision-Language Models (L-VLMs) have demonstrated remarkable performance in various vision and language tasks.<n>Small Vision-Language Models (S-VLMs) offer efficiency but suffer from a significant performance gap compared to their larger counterparts.<n>We introduce the Model Parity Aligner (MPA), a novel framework designed to systematically improve S-VLMs.
arXiv Detail & Related papers (2025-09-20T11:12:23Z) - Learning-to-Context Slope: Evaluating In-Context Learning Effectiveness Beyond Performance Illusions [42.80928434779115]
In-context learning (ICL) has emerged as an effective approach to enhance the performance of large language models.<n>Current evaluation approaches suffer from low reliability, poor attribution, and impracticality in data-insufficient scenarios.<n>We propose the Learning-to-Context Slope (LCS), a novel metric that quantifies ICL effectiveness by modeling the slope between learning gain and contextual relevance.
arXiv Detail & Related papers (2025-06-29T08:55:37Z) - OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics [101.78963920333342]
We introduce OpenUnlearning, a standardized framework for benchmarking large language models (LLMs) unlearning methods and metrics.<n>OpenUnlearning integrates 9 unlearning algorithms and 16 diverse evaluations across 3 leading benchmarks.<n>We also benchmark diverse unlearning methods and provide a comparative analysis against an extensive evaluation suite.
arXiv Detail & Related papers (2025-06-14T20:16:37Z) - Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering [66.5524727179286]
NOVA is a framework designed to identify high-quality data that aligns well with the learned knowledge to reduce hallucinations.<n>It includes Internal Consistency Probing (ICP) and Semantic Equivalence Identification (SEI) to measure how familiar the LLM is with instruction data.<n>To ensure the quality of selected samples, we introduce an expert-aligned reward model, considering characteristics beyond just familiarity.
arXiv Detail & Related papers (2025-02-11T08:05:56Z) - A Soft Sensor Method with Uncertainty-Awareness and Self-Explanation Based on Large Language Models Enhanced by Domain Knowledge Retrieval [17.605817344542345]
We propose a framework called Few-shot Uncertainty-aware and self-Explaining Soft Sensor (LLM-FUESS)<n>LLM-FUESS includes the Zero-shot Auxiliary Variable Selector (LLM-ZAVS) and the Uncertainty-aware Few-shot Soft Sensor (LLM-UFSS)<n>Our method achieved state-of-the-art predictive performance, strong robustness, and flexibility, effectively mitigates training instability found in traditional methods.
arXiv Detail & Related papers (2025-01-06T11:43:29Z) - Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset [92.99416966226724]
We introduce Facial Identity Unlearning Benchmark (FIUBench), a novel VLM unlearning benchmark designed to robustly evaluate the effectiveness of unlearning algorithms.<n>We apply a two-stage evaluation pipeline that is designed to precisely control the sources of information and their exposure levels.<n>Through the evaluation of four baseline VLM unlearning algorithms within FIUBench, we find that all methods remain limited in their unlearning performance.
arXiv Detail & Related papers (2024-11-05T23:26:10Z) - SAFE: Advancing Large Language Models in Leveraging Semantic and Syntactic Relationships for Software Vulnerability Detection [23.7268575752712]
Software vulnerabilities (SVs) have emerged as a prevalent and critical concern for safety-critical security systems.
We propose a novel framework that enhances the capability of large language models to learn and utilize semantic and syntactic relationships from source code data for SVD.
arXiv Detail & Related papers (2024-09-02T00:49:02Z) - Evaluating Linguistic Capabilities of Multimodal LLMs in the Lens of Few-Shot Learning [15.919493497867567]
This study aims to evaluate the performance of Multimodal Large Language Models (MLLMs) on the VALSE benchmark.
We conducted a comprehensive assessment of state-of-the-art MLLMs, varying in model size and pretraining datasets.
arXiv Detail & Related papers (2024-07-17T11:26:47Z) - Comprehensive Reassessment of Large-Scale Evaluation Outcomes in LLMs: A Multifaceted Statistical Approach [64.42462708687921]
Evaluations have revealed that factors such as scaling, training types, architectures and other factors profoundly impact the performance of LLMs.
Our study embarks on a thorough re-examination of these LLMs, targeting the inadequacies in current evaluation methods.
This includes the application of ANOVA, Tukey HSD tests, GAMM, and clustering technique.
arXiv Detail & Related papers (2024-03-22T14:47:35Z) - Towards Multimodal In-Context Learning for Vision & Language Models [21.69457980865084]
State-of-the-art Vision-Language Models (VLMs) ground the vision and the language modality.
We propose a simple yet surprisingly effective multi-turn curriculum-based learning methodology with effective data mixes.
arXiv Detail & Related papers (2024-03-19T13:53:37Z) - In-Context Learning Demonstration Selection via Influence Analysis [11.504012974208466]
Large Language Models (LLMs) have showcased their In-Context Learning (ICL) capabilities.
Despite its advantages, the effectiveness of ICL heavily depends on the choice of demonstrations.
We propose a demonstration selection method named InfICL, which utilizes influence functions to analyze impacts of training samples.
arXiv Detail & Related papers (2024-02-19T00:39:31Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.