Related papers: Constrained Decoding for Secure Code Generation

Constrained Decoding for Secure Code Generation

URL: http://arxiv.org/abs/2405.00218v3
Date: Sat, 20 Jul 2024 19:14:03 GMT
Title: Constrained Decoding for Secure Code Generation
Authors: Yanjun Fu, Ethan Baker, Yu Ding, Yizheng Chen,
Abstract summary: This paper introduces a new benchmark, CodeGuard+, to measure Code LLMs' ability to generate both secure and correct code. We show that the state-of-the-art defense technique, prefix tuning, may not be as strong as previously believed, since it generates secure code but sacrifices functional correctness. We propose new constrained decoding techniques to generate secure code.
Score: 9.007821185927277
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Code Large Language Models (Code LLMs) have been increasingly used by developers to boost productivity, but they often generate vulnerable code. Thus, there is an urgent need to ensure that code generated by Code LLMs is correct and secure. Previous research has primarily focused on generating secure code, overlooking the fact that secure code also needs to be correct. This oversight can lead to a false sense of security. Currently, the community lacks a method to measure actual progress in this area, and we need solutions that address both security and correctness of code generation. This paper introduces a new benchmark, CodeGuard+, along with two new metrics, to measure Code LLMs' ability to generate both secure and correct code. Using our new evaluation methods, we show that the state-of-the-art defense technique, prefix tuning, may not be as strong as previously believed, since it generates secure code but sacrifices functional correctness. We also demonstrate that different decoding methods significantly affect the security of Code LLMs. Furthermore, we explore a new defense direction: constrained decoding for secure code generation. We propose new constrained decoding techniques to generate secure code. Our results reveal that constrained decoding is more effective than prefix tuning to improve the security of Code LLMs, without requiring a specialized training dataset. Moreover, our evaluations over eight state-of-the-art Code LLMs show that constrained decoding has strong performance to improve the security of Code LLMs, and our technique outperforms GPT-4.

Related papers

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model [60.60587869092729]
Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to real-world deployment.<n>We propose SecCoderX, an online reinforcement learning framework for functionality-preserving secure code generation.
arXiv Detail & Related papers (2026-02-07T07:42:07Z)
Teaching an Old LLM Secure Coding: Localized Preference Optimization on Distilled Preferences [19.791588510105687]
We address two key challenges in improving secure code generation.<n>First, obtaining high quality training data covering a broad set of security issues is critical.<n>Second, aligning models to secure code requires focusing on localized regions of code.
arXiv Detail & Related papers (2025-05-31T06:48:12Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
Smoke and Mirrors: Jailbreaking LLM-based Code Generation via Implicit Malicious Prompts [5.718926328180089]
This paper introduces a jailbreaking approach, CodeJailbreaker, designed to uncover safety concerns in code generation. Experiments on the recently-released RMCBench benchmark demonstrate that CodeJailbreaker markedly surpasses the conventional jailbreaking strategy.
arXiv Detail & Related papers (2025-03-23T06:06:12Z)
ProSec: Fortifying Code LLMs with Proactive Security Alignment [14.907702430331803]
Security of code-specific large language models (LLMs) remains under-explored. We propose ProSec, a novel security alignment approach designed to align code LLMs with secure coding practices. Experiments show that models trained with ProSec is 29.2% to 35.5% more secure compared to previous work.
arXiv Detail & Related papers (2024-11-19T22:00:01Z)
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation. Recent studies highlight that many LLM-generated code contains serious security vulnerabilities. We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure codes.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs [2.7138982369416866]
Large Language Models (LLMs) have revolutionized automated code generation in software engineering. However, concerns have arisen regarding the security and quality of the generated code. Our research aims to tackle these issues by introducing a framework for secure behavioral learning of LLMs.
arXiv Detail & Related papers (2024-06-18T11:29:34Z)
Learning Linear Block Error Correction Codes [62.25533750469467]
We propose for the first time a unified encoder-decoder training of binary linear block codes. We also propose a novel Transformer model in which the self-attention masking is performed in a differentiable fashion for the efficient backpropagation of the code gradient.
arXiv Detail & Related papers (2024-05-07T06:47:12Z)
CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation. CodeIP is a novel multi-bit watermarking technique that embeds additional information to preserve provenance details. Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z)
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs. Our studies reveal a new and universal safety vulnerability of these models against code input. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z)
Assured LLM-Based Software Engineering [51.003878077888686]
This paper is an outline of the content of the keynote by Mark Harman at the International Workshop on Interpretability, Robustness, and Benchmarking in Neural Software Engineering, Monday 15th April 2024, Lisbon, Portugal.
arXiv Detail & Related papers (2024-02-06T20:38:46Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks. FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models [1.5457286059556397]
We propose a reinforcement learning-based method for security hardening and strengthening of generated code from Large Language Models (LLMs) In this work, we propose a reinforcement learning-based method for program-specific repair with the combination of semantic and syntactic reward mechanisms that focus heavily on adding security and functional measures in the code, respectively.
arXiv Detail & Related papers (2024-01-13T10:19:26Z)
SALLM: Security Assessment of Generated Code [0.5137309756089941]
This paper describes SALLM, a framework to benchmark Large Language Models' abilities to generate secure code systematically. The framework has three major components: a novel dataset of security-centric Python prompts, assessment techniques to evaluate the generated code, and novel metrics to evaluate the models' performance from the perspective of secure code generation.
arXiv Detail & Related papers (2023-11-01T22:46:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.