A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries
- URL: http://arxiv.org/abs/2404.11671v2
- Date: Thu, 16 May 2024 20:16:54 GMT
- Title: A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries
- Authors: Ian McCormack, Joshua Sunshine, Jonathan Aldrich,
- Abstract summary: Rust is frequently used to interoperate with languages that have far weaker restrictions.
We created MiriLLI, a tool which uses existing Rust and LLVM interpreters to jointly execute multi-language applications.
- Score: 2.359557447960552
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Rust programming language restricts aliasing and mutability to provide static safety guarantees, which developers rely on to write secure and performant applications. However, Rust is frequently used to interoperate with other languages that have far weaker restrictions. These languages support cyclic and self-referential design patterns that conflict with current models of Rust's operational semantics, representing a potentially significant source of undefined behavior that no current tools can detect. We created MiriLLI, a tool which uses existing Rust and LLVM interpreters to jointly execute multi-language Rust applications. We used our tool in a large-scale study of Rust libraries that call foreign functions, and we found 45 instances of undefined or undesirable behavior. These include four bugs from libraries that had over 10,000 daily downloads on average, one from a component of the GNU Compiler Collection (GCC), and one from a library maintained by the Rust Project. Most of these errors were caused by incompatible aliasing and initialization patterns, incorrect foreign function bindings, and invalid type conversion. The majority of aliasing violations were caused by unsound operations in Rust, but they occurred in foreign code. The Rust community must invest in new tools for validating multi-language programs to ensure that developers can easily detect and fix these errors.
Related papers
- CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation [63.23120252801889]
CRUST-Bench is a dataset of 100 C repositories, each paired with manually-written interfaces in safe Rust as well as test cases.
We evaluate state-of-the-art large language models (LLMs) on this task and find that safe and idiomatic Rust generation is still a challenging problem.
The best performing model, OpenAI o1, is able to solve only 15 tasks in a single-shot setting.
arXiv Detail & Related papers (2025-04-21T17:33:33Z) - RustMap: Towards Project-Scale C-to-Rust Migration via Program Analysis and LLM [13.584956125542396]
Rust offers superior memory safety while maintaining C's high performance.
Existing automated translation tools, such as C2Rust, may rely too much on syntactic, template-based translation.
This paper introduces a novel dependency-guided and large language model (LLM)-based C-to-Rust translation approach, RustMap.
arXiv Detail & Related papers (2025-03-22T11:57:45Z) - Commit0: Library Generation from Scratch [77.38414688148006]
Commit0 is a benchmark that challenges AI agents to write libraries from scratch.
Agents are provided with a specification document outlining the library's API as well as a suite of interactive unit tests.
Commit0 also offers an interactive environment where models receive static analysis and execution feedback on the code they generate.
arXiv Detail & Related papers (2024-12-02T18:11:30Z) - Towards Exception Safety Code Generation with Intermediate Representation Agents Framework [54.03528377384397]
Large Language Models (LLMs) often struggle with robust exception handling in generated code, leading to fragile programs that are prone to runtime errors.<n>We propose Seeker, a novel multi-agent framework that enforces exception safety in LLM generated code through an Intermediate Representation (IR) approach.<n>Seeker decomposes exception handling into five specialized agents: Scanner, Detector, Predator, Ranker, and Handler.
arXiv Detail & Related papers (2024-10-09T14:45:45Z) - ChatZero:Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language [53.8622516025736]
We propose a novel end-to-end zero-shot dialogue generation model ChatZero based on cross-lingual code-switching method.
Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90% of the original performance.
arXiv Detail & Related papers (2024-08-16T13:11:53Z) - VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners [6.824327908701066]
Rust is a programming language that combines memory safety and low-level control, providing C-like performance.
Existing work falls into two categories: rule-based and large language model (LLM)-based.
We present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness.
arXiv Detail & Related papers (2024-04-29T16:45:03Z) - A Mixed-Methods Study on the Implications of Unsafe Rust for Interoperation, Encapsulation, and Tooling [2.2463451968497425]
Rust developers need verification tools that can provide guarantees of soundness within multi-language applications.
We study how developers reason about foreign function calls, the limitations of the tools that they currently use, their motivations for using unsafe code, and how they reason about encapsulating it.
arXiv Detail & Related papers (2024-04-02T18:36:21Z) - Demystifying Compiler Unstable Feature Usage and Impacts in the Rust
Ecosystem [6.742722083947134]
Rust compiler introduces Rust unstable features (RUF) to extend compiler functionality, syntax, and standard library support.
RUF may get removed, introducing compilation failures to dependent packages.
Our study shows that the Rust ecosystem uses 1000 different RUF, and at most 44% of package versions are affected by RUF.
To mitigate wide RUF impacts, we further design and implement a RUF-compilation-failure recovery tool.
arXiv Detail & Related papers (2023-10-26T06:43:25Z) - Yuga: Automatically Detecting Lifetime Annotation Bugs in the Rust Language [15.164423552903571]
Security vulnerabilities have been reported in Rust projects, often attributed to the use of "unsafe" Rust code.
These vulnerabilities, in part, arise from incorrect lifetime annotations on function signatures.
Existing tools fail to detect these bugs, primarily because such bugs are rare, challenging to detect through dynamic analysis.
We devise a novel static analysis tool, Yuga, to detect potential lifetime annotation bugs.
arXiv Detail & Related papers (2023-10-12T17:05:03Z) - Fixing Rust Compilation Errors using LLMs [2.1781086368581932]
The Rust programming language has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++.
This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors.
RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
arXiv Detail & Related papers (2023-08-09T18:30:27Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs)
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security
Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z) - Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of
Language Models [86.02610674750345]
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.
We apply 14 adversarial attack methods to GLUE tasks to construct AdvGLUE, which is further validated by humans for reliable annotations.
All the language models and robust training methods we tested perform poorly on AdvGLUE, with scores lagging far behind the benign accuracy.
arXiv Detail & Related papers (2021-11-04T12:59:55Z) - Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.