Enabling Memory Safety of C Programs using LLMs
- URL: http://arxiv.org/abs/2404.01096v1
- Date: Mon, 1 Apr 2024 13:05:54 GMT
- Title: Enabling Memory Safety of C Programs using LLMs
- Authors: Nausheen Mohammed, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma
- Abstract summary: Memory safety violations in low-level code, written in languages like C, remain one of the major sources of software vulnerabilities.
One method of removing such violations by construction is to port C code to a safe C dialect.
Such dialects rely on programmer-supplied annotations to guarantee safety with minimal runtime overhead.
This porting is a manual process that imposes a significant burden on the programmer; hence, the technique has seen limited adoption.
- Score: 5.297072277460838
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory safety violations in low-level code, written in languages like C, remain one of the major sources of software vulnerabilities. One method of removing such violations by construction is to port C code to a safe C dialect. Such dialects rely on programmer-supplied annotations to guarantee safety with minimal runtime overhead. This porting, however, is a manual process that imposes a significant burden on the programmer and, hence, there has been limited adoption of this technique. The task of porting not only requires inferring annotations, but may also need refactoring/rewriting of the code to make it amenable to such annotations. In this paper, we use Large Language Models (LLMs) to address both of these concerns. We show how to harness LLM capabilities to do complex code reasoning as well as rewriting of large codebases. We also present a novel framework for whole-program transformations that leverages lightweight static analysis to break the transformation into smaller steps that can be carried out effectively by an LLM. We implement our ideas in a tool called MSA that targets the CheckedC dialect. We evaluate MSA on several micro-benchmarks, as well as real-world code ranging up to 20K lines of code. We show superior performance compared to a vanilla LLM baseline, and demonstrate improvement over a state-of-the-art symbolic (non-LLM) technique.
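To make the porting burden concrete, here is a minimal sketch of the kind of change involved. The function is invented for illustration, but `_Array_ptr` and `count(...)` are genuine CheckedC bounds annotations; the checked variant compiles only with the CheckedC compiler.

```c
#include <stddef.h>

/* Plain C: the relationship between buf and n is implicit, so an
 * out-of-bounds write compiles without complaint. */
void fill(int *buf, size_t n) {
    for (size_t i = 0; i < n; i++)
        buf[i] = (int)i;
}

/* CheckedC port: the annotation ties buf's bounds to n, so the
 * compiler either proves each access safe or inserts a runtime check. */
void fill_checked(_Array_ptr<int> buf : count(n), size_t n) {
    for (size_t i = 0; i < n; i++)
        buf[i] = (int)i;
}
```

Inferring such `count(...)` annotations across a whole codebase, and refactoring code that does not fit them, is precisely the manual effort the paper offloads to the LLM.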
Related papers
- Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models [1.8416014644193066]
Large language models (LLMs) show promise for automating this translation by generating more natural and safer code than rule-based methods.
We propose an LLM-based translation scheme that improves the success rate of translating large-scale C code into compilable Rust code.
In experiments with 20 benchmark C programs, including those exceeding 4,000 lines of code, we successfully translated all programs into compilable Rust code.
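As a hypothetical illustration of the segmentation idea (the names below are invented, not taken from the paper), each segment handed to the LLM would be bundled with the declarations it depends on, so the generated Rust can compile stand-alone:

```c
/* Context attached to the segment: declarations the segment depends on. */
typedef struct { double x, y; } point_t;
double dist(point_t a, point_t b);

/* The segment itself: small enough for a single LLM translation call. */
double perimeter(point_t a, point_t b, point_t c) {
    return dist(a, b) + dist(b, c) + dist(c, a);
}
```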
arXiv Detail & Related papers (2024-09-16T17:52:36Z)
- HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that much LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
- What can Large Language Models Capture about Code Functional Equivalence? [24.178831487657945]
We introduce SeqCoBench, a benchmark for assessing how Code-LLMs can capture code functional equivalence.
We conduct evaluations on state-of-the-art (Code-)LLMs to see if they can discern semantically equivalent or different pairs of programs in SeqCoBench.
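For intuition, a pair a model should judge semantically equivalent might look like the following (a hypothetical example, written in C for consistency with this page rather than drawn from the benchmark):

```c
/* Both functions return the sum of the first n elements of a, despite
 * different surface forms: indexed loop vs. pointer arithmetic. */
int sum_loop(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

int sum_ptr(const int *a, int n) {
    int s = 0;
    while (n-- > 0)
        s += *a++;
    return s;
}
```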
arXiv Detail & Related papers (2024-08-20T11:19:06Z)
- Assured LLM-Based Software Engineering [51.003878077888686]
This paper outlines the keynote by Mark Harman at the International Workshop on Interpretability, Robustness, and Benchmarking in Neural Software Engineering (Lisbon, Portugal, 15 April 2024).
arXiv Detail & Related papers (2024-02-06T20:38:46Z)
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs [65.2379940117181]
We introduce code prompting, a chain of prompts that transforms a natural language problem into code.
We find that code prompting yields a substantial performance boost for multiple LLMs.
Our analysis of GPT-3.5 reveals that the code formatting of the input problem is essential for the performance improvement.
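As a sketch of the idea (the question and names below are invented, not taken from the paper), a natural-language conditional-reasoning problem can be re-expressed as code whose branches make the conditions explicit:

```c
#include <stdbool.h>
#include <stdio.h>

/* NL problem: "Visitors over 18, or visitors accompanied by an adult,
 * may enter. Alice is 16 and accompanied by an adult. May she enter?" */
bool may_enter(int age, bool accompanied_by_adult) {
    return age > 18 || accompanied_by_adult;
}

int main(void) {
    printf("%s\n", may_enter(16, true) ? "yes" : "no"); /* prints "yes" */
    return 0;
}
```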
arXiv Detail & Related papers (2024-01-18T15:32:24Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial Natural Language Instructions [27.489622263456983]
We introduce DeceptPrompt, an algorithm that can generate adversarial natural language instructions that drive Code LLMs to generate functionally correct code with vulnerabilities.
When the optimized prefix/suffix is applied, the attack success rate (ASR) improves by an average of 50% compared with no prefix/suffix.
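The kind of functionally correct but vulnerable output such attacks aim to elicit looks like this invented sketch (not an example from the paper):

```c
#include <string.h>

/* Does what was asked: copies a user-supplied name into a buffer.
 * But strcpy performs no bounds check, so any input longer than
 * 31 characters overflows dst (an out-of-bounds write, CWE-787). */
void store_name(char dst[32], const char *src) {
    strcpy(dst, src); /* a safe version would bound the copy */
}
```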
arXiv Detail & Related papers (2023-12-07T22:19:06Z)
- SALLM: Security Assessment of Generated Code [0.5137309756089941]
This paper describes SALLM, a framework to benchmark Large Language Models' abilities to generate secure code systematically.
The framework has three major components: a novel dataset of security-centric Python prompts, assessment techniques to evaluate the generated code, and novel metrics to evaluate the models' performance from the perspective of secure code generation.
arXiv Detail & Related papers (2023-11-01T22:46:31Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual rewriting and engineering effort.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Exploring Continual Learning for Code Generation Models [80.78036093054855]
Continual Learning (CL) is an important aspect that remains underexplored in the code domain.
We introduce a benchmark called CodeTask-CL that covers a wide range of tasks, including code generation, translation, summarization, and refinement.
We find that effective methods like Prompt Pooling (PP) suffer from catastrophic forgetting due to the unstable training of the prompt selection mechanism.
arXiv Detail & Related papers (2023-07-05T16:58:39Z)
- LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluations [4.276841620787673]
Large Language Models (LLMs) like Codex are powerful tools for performing code completion and code generation tasks.
These models are capable of generating code snippets from Natural Language (NL) descriptions by learning languages and programming practices from public GitHub repositories.
Although LLMs promise an effortless NL-driven deployment of software applications, the security of the code they generate has not been extensively investigated or documented.
arXiv Detail & Related papers (2023-03-16T15:13:58Z)