Related papers: Fixing Rust Compilation Errors using LLMs

Fixing Rust Compilation Errors using LLMs

URL: http://arxiv.org/abs/2308.05177v1
Date: Wed, 9 Aug 2023 18:30:27 GMT
Title: Fixing Rust Compilation Errors using LLMs
Authors: Pantazis Deligiannis, Akash Lal, Nikita Mehrotra, Aseem Rastogi
Abstract summary: The Rust programming language has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++. This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors. RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
Score: 2.1781086368581932
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Rust programming language, with its safety guarantees, has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++. These guarantees come from a strong ownership-based type system, as well as primitive support for features like closures, pattern matching, etc., that make the code more concise and amenable to reasoning. These unique Rust features also pose a steep learning curve for programmers. This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors. RustAssistant uses a careful combination of prompting techniques as well as iteration with an LLM to deliver high accuracy of fixes. RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories. We plan to release our dataset of Rust compilation errors to enable further research.

Related papers

EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation [16.12483934561206]
EvoC2Rust is an automated framework for converting entire C projects to equivalent Rust ones.<n>Our evaluation on open-source benchmarks and six industrial projects demonstrates EvoC2Rust's superior performance in project-level C-to-Rust translation.
arXiv Detail & Related papers (2025-08-06T10:31:23Z)
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation [63.23120252801889]
CRUST-Bench is a dataset of 100 C repositories, each paired with manually-written interfaces in safe Rust as well as test cases. We evaluate state-of-the-art large language models (LLMs) on this task and find that safe and idiomatic Rust generation is still a challenging problem. The best performing model, OpenAI o1, is able to solve only 15 tasks in a single-shot setting.
arXiv Detail & Related papers (2025-04-21T17:33:33Z)
RustMap: Towards Project-Scale C-to-Rust Migration via Program Analysis and LLM [13.584956125542396]
Rust offers superior memory safety while maintaining C's high performance. Existing automated translation tools, such as C2Rust, may rely too much on syntactic, template-based translation. This paper introduces a novel dependency-guided and large language model (LLM)-based C-to-Rust translation approach, RustMap.
arXiv Detail & Related papers (2025-03-22T11:57:45Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
C2SaferRust: Transforming C Projects into Safer Rust with NeuroSymbolic Techniques [16.111888466557144]
We present C2SaferRust, a novel approach to translate C to Rust. We first use C2Rust to convert C code to non-idiomatic, unsafe Rust. We decompose the unsafe Rust code into slices that can be individually translated to safer Rust by an LLM. After processing each slice, we run end-to-end test cases to verify that the code still functions as expected.
arXiv Detail & Related papers (2025-01-24T05:53:07Z)
Translating C To Rust: Lessons from a User Study [12.49361031696427]
Rust aims to offer full memory safety for programs, a guarantee that untamed C programs do not enjoy. We report on a user study asking humans to translate real-world C programs to Rust. Our participants are able to produce safe Rust translations, whereas state-of-the-art automatic tools are not able to do so.
arXiv Detail & Related papers (2024-11-21T14:37:05Z)
Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models [1.8416014644193066]
Large language models (LLMs) show promise for automating this translation by generating more natural and safer code than rule-based methods. We propose an LLM-based translation scheme that improves the success rate of translating large-scale C code into compilable Rust code. In experiments with 20 benchmark C programs, including those exceeding 4 kilo lines of code, we successfully translated all programs into compilable Rust code.
arXiv Detail & Related papers (2024-09-16T17:52:36Z)
Investigating the Transferability of Code Repair for Low-Resource Programming Languages [57.62712191540067]
Large language models (LLMs) have shown remarkable performance on code generation tasks. Recent works augment the code repair process by integrating modern techniques such as chain-of-thought reasoning or distillation. We investigate the benefits of distilling code repair for both high and low resource languages.
arXiv Detail & Related papers (2024-06-21T05:05:39Z)
VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners [6.824327908701066]
Rust is a programming language that combines memory safety and low-level control, providing C-like performance. Existing work falls into two categories: rule-based and large language model (LLM)-based. We present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness.
arXiv Detail & Related papers (2024-04-29T16:45:03Z)
A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries [2.359557447960552]
Rust is frequently used to interoperate with other languages. Miri is the only dynamic analysis tool capable of validating applications against these models. Miri does not support foreign functions, indicating that there may be a critical correctness gap at the heart of the Rust ecosystem.
arXiv Detail & Related papers (2024-04-17T18:12:05Z)
ReGAL: Refactoring Programs to Discover Generalizable Abstractions [59.05769810380928]
Generalizable Abstraction Learning (ReGAL) is a method for learning a library of reusable functions via codeization. We find that the shared function libraries discovered by ReGAL make programs easier to predict across diverse domains. For CodeLlama-13B, ReGAL results in absolute accuracy increases of 11.5% on LOGO, 26.1% on date understanding, and 8.1% on TextCraft, outperforming GPT-3.5 in two of three domains.
arXiv Detail & Related papers (2024-01-29T18:45:30Z)
Demystifying Compiler Unstable Feature Usage and Impacts in the Rust Ecosystem [6.742722083947134]
Rust compiler introduces Rust unstable features (RUF) to extend compiler functionality, syntax, and standard library support. RUF may get removed, introducing compilation failures to dependent packages. Our study shows that the Rust ecosystem uses 1000 different RUF, and at most 44% of package versions are affected by RUF. To mitigate wide RUF impacts, we further design and implement a RUF-compilation-failure recovery tool.
arXiv Detail & Related papers (2023-10-26T06:43:25Z)
Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts. Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness. Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems. static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models. We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents. Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages. We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
LM-Critic: Language Models for Unsupervised Grammatical Error Correction [128.9174409251852]
We show how to leverage a pretrained language model (LM) in defining an LM-Critic, which judges a sentence to be grammatical. We apply this LM-Critic and BIFI along with a large set of unlabeled sentences to bootstrap realistic ungrammatical / grammatical pairs for training a corrector.
arXiv Detail & Related papers (2021-09-14T17:06:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.