Translating C To Rust: Lessons from a User Study
- URL: http://arxiv.org/abs/2411.14174v1
- Date: Thu, 21 Nov 2024 14:37:05 GMT
- Title: Translating C To Rust: Lessons from a User Study
- Authors: Ruishi Li, Bo Wang, Tianyu Li, Prateek Saxena, Ashish Kundu,
- Abstract summary: Rust aims to offer full memory safety for programs, a guarantee that untamed C programs do not enjoy.
We report on a user study asking humans to translate real-world C programs to Rust.
Our participants are able to produce safe Rust translations, whereas state-of-the-art automatic tools are not able to do so.
- Score: 12.49361031696427
- License:
- Abstract: Rust aims to offer full memory safety for programs, a guarantee that untamed C programs do not enjoy. How difficult is it to translate existing C code to Rust? To get a complementary view from that of automatic C to Rust translators, we report on a user study asking humans to translate real-world C programs to Rust. Our participants are able to produce safe Rust translations, whereas state-of-the-art automatic tools are not able to do so. Our analysis highlights that the high-level strategy taken by users departs significantly from those of automatic tools we study. We also find that users often choose zero-cost (static) abstractions for temporal safety, which addresses a predominant component of runtime costs in other full memory safety defenses. User-provided translations showcase a rich landscape of specialized strategies to translate the same C program in different ways to safe Rust, which future automatic translators can consider.
Related papers
- Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models [1.8416014644193066]
Large language models (LLMs) show promise for automating this translation by generating more natural and safer code than rule-based methods.
We propose an LLM-based translation scheme that improves the success rate of translating large-scale C code into compilable Rust code.
In experiments with 20 benchmark C programs, including those exceeding 4 kilo lines of code, we successfully translated all programs into compilable Rust code.
arXiv Detail & Related papers (2024-09-16T17:52:36Z) - Bringing Rust to Safety-Critical Systems in Space [1.0742675209112622]
Rust aims to drastically reduce the chance of introducing bugs and produces overall more secure and safer code.
This work provides a set of recommendations for the development of safety-critical space systems in Rust.
arXiv Detail & Related papers (2024-05-28T12:48:47Z) - A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries [2.359557447960552]
Rust is frequently used to interoperate with other languages.
Miri is the only dynamic analysis tool capable of validating applications against these models.
Miri does not support foreign functions, indicating that there may be a critical correctness gap at the heart of the Rust ecosystem.
arXiv Detail & Related papers (2024-04-17T18:12:05Z) - Towards a Transpiler for C/C++ to Safer Rust [0.10993800728351737]
Rust is a programming language developed by Mozilla that focuses on performance and safety.
How to convert an existing C++ code base to Rust is also gaining greater attention.
arXiv Detail & Related papers (2024-01-16T10:35:59Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - Fixing Rust Compilation Errors using LLMs [2.1781086368581932]
The Rust programming language has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++.
This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors.
RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
arXiv Detail & Related papers (2023-08-09T18:30:27Z) - Adapting Language Models to Compress Contexts [71.98287002918941]
Transformer-based language models (LMs) are powerful and widely-applicable tools, but their usefulness is constrained by a finite context window.
We propose to adapt pre-trained LMs into AutoCompressors, which are capable of compressing long contexts into compact summary vectors.
We fine-tune OPT and Llama-2 models on sequences of up to 30,720 tokens and show that AutoCompressors can utilize long contexts to improve perplexity.
arXiv Detail & Related papers (2023-05-24T06:42:44Z) - User-Centric Evaluation of OCR Systems for Kwak'wala [92.73847703011353]
We show that utilizing OCR reduces the time spent in the manual transcription of culturally valuable documents by over 50%.
Our results demonstrate the potential benefits that OCR tools can have on downstream language documentation and revitalization efforts.
arXiv Detail & Related papers (2023-02-26T21:41:15Z) - Safe Deep Reinforcement Learning by Verifying Task-Level Properties [84.64203221849648]
Cost functions are commonly employed in Safe Deep Reinforcement Learning (DRL)
The cost is typically encoded as an indicator function due to the difficulty of quantifying the risk of policy decisions in the state space.
In this paper, we investigate an alternative approach that uses domain knowledge to quantify the risk in the proximity of such states by defining a violation metric.
arXiv Detail & Related papers (2023-02-20T15:24:06Z) - English Contrastive Learning Can Learn Universal Cross-lingual Sentence
Embeddings [77.94885131732119]
Universal cross-lingual sentence embeddings map semantically similar cross-lingual sentences into a shared embedding space.
In this work, we propose mSimCSE, which extends SimCSE to multilingual settings and reveal that contrastive learning on English data can surprisingly learn high-quality universal cross-lingual sentence embeddings without any parallel data.
arXiv Detail & Related papers (2022-11-11T11:17:56Z) - OCR Post Correction for Endangered Language Texts [113.8242302688894]
We create a benchmark dataset of transcriptions for scanned books in three critically endangered languages.
We present a systematic analysis of how general-purpose OCR tools are not robust to the data-scarce setting.
We develop an OCR post-correction method tailored to ease training in this data-scarce setting.
arXiv Detail & Related papers (2020-11-10T21:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.