Related papers: CRepair: CVAE-based Automatic Vulnerability Repair Technology

CRepair: CVAE-based Automatic Vulnerability Repair Technology

URL: http://arxiv.org/abs/2411.05540v1
Date: Fri, 08 Nov 2024 12:55:04 GMT
Title: CRepair: CVAE-based Automatic Vulnerability Repair Technology
Authors: Penghui Liu, Yingzhou Bi, Jiangtao Huang, Xinxin Jiang, Lianmei Wang,
Abstract summary: Software vulnerabilities pose significant threats to the integrity, security, and reliability of modern software and its application data. To address the challenges of vulnerability repair, researchers have proposed various solutions, with learning-based automatic vulnerability repair techniques gaining widespread attention. This paper proposes CRepair, a CVAE-based automatic vulnerability repair technology aimed at fixing security vulnerabilities in system code.
Score: 1.147605955490786
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Software vulnerabilities are flaws in computer software systems that pose significant threats to the integrity, security, and reliability of modern software and its application data. These vulnerabilities can lead to substantial economic losses across various industries. Manual vulnerability repair is not only time-consuming but also prone to errors. To address the challenges of vulnerability repair, researchers have proposed various solutions, with learning-based automatic vulnerability repair techniques gaining widespread attention. However, existing methods often focus on learning more vulnerability data to improve repair outcomes, while neglecting the diverse characteristics of vulnerable code, and suffer from imprecise vulnerability localization.To address these shortcomings, this paper proposes CRepair, a CVAE-based automatic vulnerability repair technology aimed at fixing security vulnerabilities in system code. We first preprocess the vulnerability data using a prompt-based method to serve as input to the model. Then, we apply causal inference techniques to map the vulnerability feature data to probability distributions. By employing multi-sample feature fusion, we capture diverse vulnerability feature information. Finally, conditional control is used to guide the model in repairing the vulnerabilities.Experimental results demonstrate that the proposed method significantly outperforms other benchmark models, achieving a perfect repair rate of 52%. The effectiveness of the approach is validated from multiple perspectives, advancing AI-driven code vulnerability repair and showing promising applications.

Related papers

From Model to Breach: Towards Actionable LLM-Generated Vulnerabilities Reporting [43.57360781012506]
We show that even the latest open-weight models are vulnerable in the earliest reported vulnerability scenarios.<n>We introduce a new severity metric that reflects the risk posed by an LLM-generated vulnerability.<n>To encourage the mitigation of the most serious and prevalent vulnerabilities, we use PE to define the Model Exposure (ME) score.
arXiv Detail & Related papers (2025-11-06T16:52:27Z)
Vul-R2: A Reasoning LLM for Automated Vulnerability Repair [31.00029150695804]
An exponential increase in software vulnerabilities has created an urgent need for automatic vulnerability repair (AVR) solutions.<n>Recent research has formulated as a sequence generation problem and has leveraged large language models (LLMs) to address this problem.<n>Although these approaches show state-of-the-art performance, they face the following challenges.
arXiv Detail & Related papers (2025-10-07T00:43:13Z)
Revisiting Vulnerability Patch Localization: An Empirical Study and LLM-Based Solution [44.388332647211776]
Open-source software vulnerability patch detection is a critical component for maintaining software security and ensuring software supply chain integrity.<n>Traditional detection methods face significant scalability challenges when processing large volumes of commit histories.<n>We propose a novel two-stage framework that combines version-driven candidate filtering with large language model-based multi-round dialogue voting.
arXiv Detail & Related papers (2025-09-19T09:09:55Z)
InfraFix: Technology-Agnostic Repair of Infrastructure as Code [46.798266975184276]
InfraFix is the first technology-agnostic framework for repairing IaC scripts. We demonstrate its effectiveness across 254,755 repair scenarios with a success rate of 95.5%.
arXiv Detail & Related papers (2025-03-21T15:24:54Z)
There are More Fish in the Sea: Automated Vulnerability Repair via Binary Templates [4.907610470063863]
We propose a template-based automated vulnerability repair approach for Java binaries. Experiments on the Vul4J dataset demonstrate that TemVUR successfully repairs 11 vulnerabilities. To assess the generalizability of TemVUR, we curate the ManyVuls4J dataset.
arXiv Detail & Related papers (2024-11-27T06:59:45Z)
In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models [97.82118821263825]
Text-to-image (T2I) models have shown remarkable progress, but their potential to generate harmful content remains a critical concern in the ML community. We propose ICER, a novel red-teaming framework that generates interpretable and semantic meaningful problematic prompts. Our work provides crucial insights for developing more robust safety mechanisms in T2I systems.
arXiv Detail & Related papers (2024-11-25T04:17:24Z)
Enhancing Pre-Trained Language Models for Vulnerability Detection via Semantic-Preserving Data Augmentation [4.374800396968465]
We propose a data augmentation technique aimed at enhancing the performance of pre-trained language models for vulnerability detection. By incorporating our augmented dataset in fine-tuning a series of representative code pre-trained models, up to 10.1% increase in accuracy and 23.6% increase in F1 can be achieved.
arXiv Detail & Related papers (2024-09-30T21:44:05Z)
FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks. We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness. Our model outclasses the state-of-the-art for resilient fault prediction benchmarking, with an accuracy of up to 0.958.
arXiv Detail & Related papers (2024-03-26T08:51:23Z)
Causative Insights into Open Source Software Security using Large Language Code Embeddings and Semantic Vulnerability Graph [3.623199159688412]
Open Source Software (OSS) vulnerabilities can cause unauthorized access, data breaches, network disruptions, and privacy violations. Recent deep-learning techniques have shown great promise in identifying and localizing vulnerabilities in source code. Our study shows a 24% improvement in code repair capabilities compared to previous methods.
arXiv Detail & Related papers (2024-01-13T10:33:22Z)
REEF: A Framework for Collecting Real-World Vulnerabilities and Fixes [40.401211102969356]
We propose an automated collecting framework REEF to collect REal-world vulnErabilities and Fixes from open-source repositories. We develop a multi-language crawler to collect vulnerabilities and their fixes, and design metrics to filter for high-quality vulnerability-fix pairs. Through extensive experiments, we demonstrate that our approach can collect high-quality vulnerability-fix pairs and generate strong explanations.
arXiv Detail & Related papers (2023-09-15T02:50:08Z)
Enabling Automatic Repair of Source Code Vulnerabilities Using Data-Driven Methods [0.4568777157687961]
We propose ways to improve code representations for vulnerability repair from three perspectives. Data-driven models of automatic program repair use pairs of buggy and fixed code to learn transformations that fix errors in code. The expected results of this work are improved code representations for automatic program repair and, specifically, fixing security vulnerabilities.
arXiv Detail & Related papers (2022-02-07T10:47:37Z)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities [7.906207218788341]
We present a novel Transformer-based learning framework (V2W-BERT) in this paper. By using ideas from natural language processing, link prediction and transfer learning, our method outperforms previous approaches. We achieve up to 97% prediction accuracy for randomly partitioned data and up to 94% prediction accuracy in temporally partitioned data.
arXiv Detail & Related papers (2021-02-23T05:16:57Z)
Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance. We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.