Galapagos: Automated N-Version Programming with LLMs
- URL: http://arxiv.org/abs/2408.09536v2
- Date: Mon, 13 Jan 2025 15:25:37 GMT
- Title: Galapagos: Automated N-Version Programming with LLMs
- Authors: Javier Ron, Diogo Gaspar, Javier Cabrera-Arteaga, Benoit Baudry, Martin Monperrus
- Abstract summary: We propose the automated generation of program variants using large language models. We design, develop and evaluate Galápagos: a tool for generating program variants. We evaluate Galápagos by creating N-Version components of real-world C code.
- Score: 10.573037638807024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: N-Version Programming is a well-known methodology for developing fault-tolerant systems. It achieves fault detection and correction at runtime by adding diverse redundancy into programs, minimizing fault mode overlap between redundant program variants. In this work, we propose the automated generation of program variants using large language models. We design, develop and evaluate Galápagos: a tool for generating program variants using LLMs, validating their correctness and equivalence, and using them to assemble N-Version binaries. We evaluate Galápagos by creating N-Version components of real-world C code. Our original results show that Galápagos can produce program variants that are proven to be functionally equivalent, even when the variants are written in a different programming language. Our systematic diversity measurement indicates that functionally equivalent variants produced by Galápagos are statically different after compilation, and present diverging internal behavior at runtime. We demonstrate that the variants produced by Galápagos can protect C code against real miscompilation bugs which affect the Clang compiler. Overall, our paper shows that producing N-Version software can be drastically automated by advanced usage of practical formal verification and generative language models.
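To make the N-Version idea concrete, here is a minimal, hypothetical sketch (not the Galápagos implementation) of a runtime harness that feeds the same input to several independently produced variant binaries and majority-votes on their outputs. The variant paths and the vote-of-three policy are illustrative assumptions; Galápagos itself additionally checks variant equivalence with formal verification and assembles the variants into a single N-Version binary.

```python
# Minimal N-Version execution sketch (illustrative only; not Galápagos itself).
# Assumes three functionally equivalent variant binaries exist at these
# hypothetical paths, each reading stdin and writing its result to stdout.
import subprocess
from collections import Counter
from typing import Optional

VARIANTS = ["./variant_a", "./variant_b", "./variant_c"]  # hypothetical paths


def run_variant(binary: str, payload: bytes, timeout: float = 5.0) -> Optional[bytes]:
    """Run one variant on the given input; return its output, or None on failure."""
    try:
        proc = subprocess.run([binary], input=payload, capture_output=True, timeout=timeout)
        return proc.stdout if proc.returncode == 0 else None
    except subprocess.TimeoutExpired:
        return None


def n_version_vote(payload: bytes) -> bytes:
    """Majority vote over variant outputs; disagreement signals a fault in some variant."""
    outputs = [run_variant(v, payload) for v in VARIANTS]
    counts = Counter(o for o in outputs if o is not None)
    if not counts:
        raise RuntimeError("all variants failed")
    winner, votes = counts.most_common(1)[0]
    if votes < 2:  # no majority among the three variants
        raise RuntimeError(f"variant outputs diverge: {outputs!r}")
    return winner


if __name__ == "__main__":
    print(n_version_vote(b"42\n"))
```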
Related papers
- DESIL: Detecting Silent Bugs in MLIR Compiler Infrastructure [8.760618981915016]
DESIL enables silent bug detection by defining a set of UB-elimination rules based on the MLIR documentation.
DESIL incorporates differential testing for silent bug detection.
It detected 23 silent bugs and 19 crash bugs, of which 12/14 have been confirmed or fixed.
arXiv Detail & Related papers (2025-04-02T05:48:51Z)
- Synthetic Programming Elicitation for Text-to-Code in Very Low-Resource Programming and Formal Languages [21.18996339478024]
We introduce synthetic programming elicitation and compilation (SPEAC).
SPEAC produces syntactically correct programs more frequently and without sacrificing semantic correctness.
We empirically evaluate the performance of SPEAC in a case study for the UCLID5 formal verification language.
arXiv Detail & Related papers (2024-06-05T22:16:19Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Experiments and ablations on four datasets, covering both C++ and Python, validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs.
We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z)
- Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler [14.259471945857431]
We investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains.
We propose a black-box generative approach that creates input programs for the K1 and K2 compilers.
Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs have been confirmed, and some of them fixed, by JetBrains developers.
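The differential-testing loop described above can be sketched roughly as follows: compile the same generated program with two compiler front-ends, run the resulting artifacts on the same input, and flag any divergence. This is an illustrative sketch, not the paper's tool; the compiler and run commands in the usage comment are placeholders.

```python
# Generic two-compiler differential-testing sketch (illustrative only).
import subprocess


def build(cmd: list[str]) -> bool:
    """Compile with the given command; True if compilation succeeded."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def run(cmd: list[str], stdin: bytes = b"") -> tuple:
    """Run an artifact and return (exit status, stdout), tolerating hangs."""
    try:
        p = subprocess.run(cmd, input=stdin, capture_output=True, timeout=10)
        return p.returncode, p.stdout
    except subprocess.TimeoutExpired:
        return "timeout", b""


def differential_test(src: str, compilers: dict, run_cmds: dict) -> None:
    """Compile `src` with each compiler, run the artifacts, and flag divergent behavior."""
    results = {}
    for name, cmd in compilers.items():
        results[name] = run(run_cmds[name]) if build(cmd) else ("compile-error", b"")
    if len(set(results.values())) > 1:
        print(f"DIVERGENCE on {src}: {results}")  # candidate compiler bug


# Hypothetical usage with two compiler versions (command lines are placeholders):
# differential_test(
#     "prog.kt",
#     compilers={"k1": ["kotlinc-k1", "prog.kt", "-d", "k1.jar"],
#                "k2": ["kotlinc-k2", "prog.kt", "-d", "k2.jar"]},
#     run_cmds={"k1": ["java", "-jar", "k1.jar"],
#               "k2": ["java", "-jar", "k2.jar"]},
# )
```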
arXiv Detail & Related papers (2024-01-12T16:01:12Z)
- AdaCCD: Adaptive Semantic Contrasts Discovery Based Cross Lingual Adaptation for Code Clone Detection [69.79627042058048]
AdaCCD is a novel cross-lingual adaptation method that can detect cloned code in a new language without annotations in that language.
We evaluate the cross-lingual adaptation results of AdaCCD by constructing a multilingual code clone detection benchmark consisting of 5 programming languages.
arXiv Detail & Related papers (2023-11-13T12:20:48Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Multilingual Code Co-Evolution Using Large Language Models [45.083171710527985]
Translating entire programs from one programming language to another is not how developers work; they propagate changes across languages.
Codeditor explicitly models code changes as edit sequences and learns to correlate changes across programming languages.
Codeditor outperforms the state-of-the-art approaches by a large margin on all commonly used automatic metrics.
arXiv Detail & Related papers (2023-07-27T16:37:30Z)
- SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly [6.080751346188323]
This paper presents SLaDe, a Small Language model Decompiler based on a sequence-to-sequence transformer trained over real-world code.
We utilize type inference to generate programs that are more readable and accurate than standard analytic and recent neural approaches.
arXiv Detail & Related papers (2023-05-21T17:31:39Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
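A minimal sketch of such an execute-and-refine loop follows; it is an assumption-laden illustration, not the paper's implementation. The model interface is passed in as a callable, and the few-shot demonstrations used in the paper are omitted.

```python
# Sketch of an execute-and-refine ("self-debugging") loop; illustrative only.
import subprocess
import sys
import tempfile
from typing import Callable


def run_candidate(code: str, test_input: str) -> tuple:
    """Execute a candidate Python program and return (passed, execution feedback)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run([sys.executable, path], input=test_input,
                              capture_output=True, text=True, timeout=10)
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stderr if proc.returncode != 0 else proc.stdout


def self_debug(task: str, test_input: str,
               ask_model: Callable[[str], str], max_rounds: int = 3) -> str:
    """Generate a program, run it, and feed execution feedback back to the model."""
    code = ask_model(f"Write a Python program for: {task}")
    for _ in range(max_rounds):
        ok, feedback = run_candidate(code, test_input)
        if ok:
            break
        code = ask_model(f"Task: {task}\nProgram:\n{code}\n"
                         f"It failed with:\n{feedback}\nReturn a corrected program.")
    return code
```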
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
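The selection rule can be sketched as follows: execute every sampled candidate on a handful of inputs and keep the candidate whose execution results agree with the largest number of other candidates. This is an illustrative sketch using an exact-match agreement criterion; how candidates are executed (the `execute` callable) is an assumption left to the caller.

```python
# Sketch of execution-result-based minimum Bayes risk selection (illustrative only).
from typing import Callable, Hashable, Sequence


def mbr_exec_select(candidates: Sequence[str],
                    inputs: Sequence[str],
                    execute: Callable[[str, str], Hashable]) -> str:
    """Return the candidate whose execution results match the most other candidates."""
    # Execution signature of each candidate: its results on all probe inputs.
    signatures = [tuple(execute(prog, x) for x in inputs) for prog in candidates]
    # Score = number of candidates (including itself) with an identical signature.
    scores = [sum(sig == other for other in signatures) for sig in signatures]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]
```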
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
- Compilable Neural Code Generation with Compiler Feedback [43.97362484564799]
This paper proposes a three-stage pipeline for compilable code generation, including language model fine-tuning, compilability reinforcement, and compilability discrimination.
Experiments on two code generation tasks demonstrate the effectiveness of our proposed approach, improving the success rate of compilation from 44.18 to 89.18 in code completion and from 70.3 to 96.2 in text-to-code generation, on average.
arXiv Detail & Related papers (2022-03-10T03:15:17Z)
- AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.