Large Language Models for Code: Security Hardening and Adversarial
Testing
- URL: http://arxiv.org/abs/2302.05319v4
- Date: Fri, 29 Sep 2023 13:53:43 GMT
- Title: Large Language Models for Code: Security Hardening and Adversarial
Testing
- Authors: Jingxuan He and Martin Vechev
- Abstract summary: Large language models (large LMs) are increasingly trained on massive codebases and used to generate code.
This work studies the security of LMs along two important axes: (i) security hardening, which aims to enhance LMs' reliability in generating secure code, and (ii) adversarial testing, which seeks to evaluate LMs' security from an adversarial standpoint.
- Score: 7.315482472726556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (large LMs) are increasingly trained on massive
codebases and used to generate code. However, LMs lack awareness of security
and are found to frequently produce unsafe code. This work studies the security
of LMs along two important axes: (i) security hardening, which aims to enhance
LMs' reliability in generating secure code, and (ii) adversarial testing, which
seeks to evaluate LMs' security from an adversarial standpoint. We address both
of these by formulating a new security task called controlled code generation.
The task is parametric and takes as input a binary property to guide the LM to
generate secure or unsafe code, while preserving the LM's capability of
generating functionally correct code. We propose a novel learning-based
approach called SVEN to solve this task. SVEN leverages property-specific
continuous vectors to guide program generation towards the given property,
without modifying the LM's weights. Our training procedure optimizes these
continuous vectors by enforcing specialized loss terms on different regions of
code, using a high-quality dataset carefully curated by us. Our extensive
evaluation shows that SVEN is highly effective in achieving strong security
control. For instance, a state-of-the-art CodeGen LM with 2.7B parameters
generates secure code 59.1% of the time. When we employ SVEN to perform
security hardening (or adversarial testing) on this LM, the ratio is
significantly boosted to 92.3% (or degraded to 36.8%). Importantly, SVEN
closely matches the original LMs in functional correctness.
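The abstract describes SVEN as steering generation with property-specific continuous vectors while the LM's weights stay frozen. The sketch below is only a toy illustration of that idea, not the paper's implementation: a stand-in linear "LM" with fixed weights, plus hypothetical `sec`/`vul` prefix vectors that are prepended to the prompt embeddings (only the prefixes would be trained) to shift the output distribution.

```python
import math
import random

random.seed(0)
D, V = 4, 3  # embedding dim, vocab size (toy values)

# Frozen stand-in "LM": mean-pools input embeddings, projects to vocab logits.
W = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(V)]

def lm_logits(embeddings):
    pooled = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    return [sum(w * x for w, x in zip(row, pooled)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Property-specific continuous prefixes ("sec" vs "vul"); in SVEN these are
# the only trained parameters -- the LM weights W are never modified.
prefix = {
    "sec": [[0.9, -0.2, 0.1, 0.4]],
    "vul": [[-0.7, 0.5, -0.3, 0.2]],
}

prompt = [[0.1, 0.2, -0.1, 0.05]]  # stand-in prompt embedding

# Prepending a different prefix shifts the next-token distribution.
p_sec = softmax(lm_logits(prefix["sec"] + prompt))
p_vul = softmax(lm_logits(prefix["vul"] + prompt))
```

The point of the sketch is the control-flow, not the numbers: the same frozen model yields different output distributions depending on which property prefix is prepended, which is the mechanism the abstract attributes to SVEN.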
Related papers
- What Makes and Breaks Safety Fine-tuning? A Mechanistic Study [64.9691741899956]
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment.
We design a synthetic data generation framework that captures salient aspects of an unsafe input.
Using this, we investigate three well-known safety fine-tuning methods.
arXiv Detail & Related papers (2024-07-14T16:12:57Z) - Constrained Decoding for Secure Code Generation [9.007821185927277]
This paper introduces a new benchmark, CodeGuard+, to measure Code LLMs' ability to generate both secure and correct code.
We show that the state-of-the-art defense technique, prefix tuning, may not be as strong as previously believed, since it generates secure code but sacrifices functional correctness.
We propose new constrained decoding techniques to generate secure code.
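The entry above proposes constrained decoding for secure code generation. As a minimal illustration of the general idea (the token-level constraint and vocabulary here are hypothetical assumptions, not CodeGuard+'s actual technique), one can mask disallowed tokens before taking the argmax at each decoding step:

```python
import math

def constrained_argmax(logits, vocab, allowed):
    """Pick the highest-scoring token that satisfies the constraint."""
    best_tok, best_score = None, -math.inf
    for tok, score in zip(vocab, logits):
        if allowed(tok) and score > best_score:
            best_tok, best_score = tok, score
    return best_tok

# Hypothetical example: forbid the unbounded-copy API during decoding.
vocab = ["strcpy", "strncpy", "snprintf"]
logits = [2.1, 1.3, 0.7]  # the unsafe token scores highest unconstrained
choice = constrained_argmax(logits, vocab, lambda t: t != "strcpy")
```

Unlike prefix tuning, this kind of decoding-time constraint leaves both the model and its scores untouched and only filters the sampling step.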
arXiv Detail & Related papers (2024-04-30T21:52:19Z) - CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z) - ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding [89.0074567748505]
We present reverse prompt contrastive decoding (ROSE), a simple-yet-effective method to boost the safety of existing instruction-tuned LLMs without any additional training.
Experiments on 6 safety and 2 general-purpose tasks show that ROSE not only brings consistent and significant safety improvements (up to +13.8% safety score) across 5 types of instruction-tuned LLMs, but also benefits the general-purpose ability of LLMs.
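ROSE's reverse prompt contrastive decoding, as summarized above, contrasts logits obtained under a "reverse" (unsafe-inducing) prompt with those under the normal prompt. A minimal stand-alone sketch (the token labels, logit values, and the weight `alpha` are illustrative assumptions, not the paper's values):

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def rose_logits(normal_logits, reverse_logits, alpha=0.5):
    # Down-weight tokens that the reverse (unsafe) prompt also favors.
    return [n - alpha * r for n, r in zip(normal_logits, reverse_logits)]

# Toy next-token scores for ["refuse", "comply_safely", "comply_unsafely"]
normal = [0.2, 1.0, 1.1]
reverse = [-0.5, 0.1, 1.4]  # the reverse prompt pushes the unsafe token
adjusted = softmax(rose_logits(normal, reverse))
```

In this toy case the unsafe token is the argmax under the normal prompt alone, but subtracting the reverse-prompt scores flips the preference to the safe completion, which is the training-free effect the summary describes.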
arXiv Detail & Related papers (2024-02-19T06:58:42Z) - Instruction Tuning for Secure Code Generation [6.043526197249358]
Existing instruction tuning schemes overlook a crucial aspect: the security of generated code.
SafeCoder performs security-centric fine-tuning using a diverse and high-quality dataset.
It is able to drastically improve security (by about 30%) while preserving utility.
arXiv Detail & Related papers (2024-02-14T15:47:46Z) - Code Security Vulnerability Repair Using Reinforcement Learning with
Large Language Models [1.5457286059556397]
We propose a reinforcement learning-based method for security hardening and strengthening of code generated by Large Language Models (LLMs).
The method performs program-specific repair by combining semantic and syntactic reward mechanisms that focus on adding security and functional measures to the code, respectively.
arXiv Detail & Related papers (2024-01-13T10:19:26Z) - Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code [0.5137309756089941]
This paper describes SALLM, a framework to benchmark Large Language Models' abilities to generate secure code systematically.
The framework has three major components: a novel dataset of security-centric Python prompts, assessment techniques to evaluate the generated code, and novel metrics to evaluate the models' performance from the perspective of secure code generation.
arXiv Detail & Related papers (2023-11-01T22:46:31Z) - Identifying the Risks of LM Agents with an LM-Emulated Sandbox [68.26587052548287]
Language Model (LM) agents and tools enable a rich set of capabilities but also amplify potential risks.
The high cost of testing these agents makes it increasingly difficult to find high-stakes, long-tailed risks.
We introduce ToolEmu: a framework that uses an LM to emulate tool execution and enables the testing of LM agents against a diverse range of tools and scenarios.
arXiv Detail & Related papers (2023-09-25T17:08:02Z) - Online Safety Property Collection and Refinement for Safe Deep
Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time.
CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties.
We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP allows higher returns and lower violations over previous Safe DRL approaches.
arXiv Detail & Related papers (2023-02-13T21:19:36Z) - Evaluating Model-free Reinforcement Learning toward Safety-critical
Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.