Decompiling Rust: An Empirical Study of Compiler Optimizations and Reverse Engineering Challenges
- URL: http://arxiv.org/abs/2507.18792v1
- Date: Thu, 24 Jul 2025 20:26:42 GMT
- Title: Decompiling Rust: An Empirical Study of Compiler Optimizations and Reverse Engineering Challenges
- Authors: Zixu Zhou,
- Abstract summary: Decompiling Rust binaries is challenging due to the language's rich type system, aggressive compiler optimizations, and widespread use of high-level abstractions.<n>We conduct a benchmark-driven evaluation of decompilation quality across core Rust features and compiler build modes.<n>Our findings provide actionable insights for tool developers and highlight the need for Rust-aware decompilation strategies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Decompiling Rust binaries is challenging due to the language's rich type system, aggressive compiler optimizations, and widespread use of high-level abstractions. In this work, we conduct a benchmark-driven evaluation of decompilation quality across core Rust features and compiler build modes. Our automated scoring framework shows that generic types, trait methods, and error handling constructs significantly reduce decompilation quality, especially in release builds. Through representative case studies, we analyze how specific language constructs affect control flow, variable naming, and type information recovery. Our findings provide actionable insights for tool developers and highlight the need for Rust-aware decompilation strategies.
Related papers
- CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation [63.23120252801889]
CRUST-Bench is a dataset of 100 C repositories, each paired with manually-written interfaces in safe Rust as well as test cases.<n>We evaluate state-of-the-art large language models (LLMs) on this task and find that safe and idiomatic Rust generation is still a challenging problem.<n>The best performing model, OpenAI o1, is able to solve only 15 tasks in a single-shot setting.
arXiv Detail & Related papers (2025-04-21T17:33:33Z) - Beyond the Edge of Function: Unraveling the Patterns of Type Recovery in Binary Code [55.493408628371235]
We propose ByteTR, a framework for recovering variable types in binary code.<n>In light of the ubiquity of variable propagation across functions, ByteTR conducts inter-procedural analysis to trace variable propagation and employs a gated graph neural network to capture long-range data flow dependencies for variable type recovery.
arXiv Detail & Related papers (2025-03-10T12:27:05Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [81.12673534903979]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools.<n>We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z) - StructTest: Benchmarking LLMs' Reasoning through Compositional Structured Outputs [78.84060166851805]
StructTest is a novel benchmark that evaluates large language models (LLMs) on their ability to follow compositional instructions and generate structured outputs.<n> Assessments are conducted deterministically using a rule-based evaluator, which can be easily extended to new tasks and datasets.<n>We demonstrate that StructTest remains challenging even for top-performing models like Deepseek-V3/R1 and GPT-4o.
arXiv Detail & Related papers (2024-12-23T22:08:40Z) - CompilerDream: Learning a Compiler World Model for General Code Optimization [58.87557583347996]
We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization.<n>It comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies.<n>It excels across diverse datasets, surpassing LLVM's built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.
arXiv Detail & Related papers (2024-04-24T09:20:33Z) - A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries [2.359557447960552]
Rust is frequently used to interoperate with other languages.<n>Miri is the only dynamic analysis tool that can validate applications against these models.<n>Miri does not support finding bugs in foreign functions, indicating that there may be a critical correctness gap across the Rust ecosystem.
arXiv Detail & Related papers (2024-04-17T18:12:05Z) - Accurate Coverage Metrics for Compiler-Generated Debugging Information [0.0]
Many tools rely on compiler-produced metadata to present a source-language view of program states.
Current approaches for measuring the extent of coverage of local variables are based on crude assumptions.
We propose some new metrics, computable by our tools, which could serve as motivation for language implementations to improve debug quality.
arXiv Detail & Related papers (2024-02-07T13:01:28Z) - Towards a Transpiler for C/C++ to Safer Rust [0.10993800728351737]
Rust is a programming language developed by Mozilla that focuses on performance and safety.
How to convert an existing C++ code base to Rust is also gaining greater attention.
arXiv Detail & Related papers (2024-01-16T10:35:59Z) - Fixing Rust Compilation Errors using LLMs [2.1781086368581932]
The Rust programming language has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++.
This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors.
RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
arXiv Detail & Related papers (2023-08-09T18:30:27Z) - Improving Long-Horizon Imitation Through Instruction Prediction [93.47416552953075]
In this work, we explore the use of an often unused source of auxiliary supervision: language.
Inspired by recent advances in transformer-based models, we train agents with an instruction prediction loss that encourages learning temporally extended representations that operate at a high level of abstraction.
In further analysis we find that instruction modeling is most important for tasks that require complex reasoning, while understandably offering smaller gains in environments that require simple plans.
arXiv Detail & Related papers (2023-06-21T20:47:23Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - Beyond the C: Retargetable Decompilation using Neural Machine
Translation [5.734661402742406]
We develop a prototype decompiler that is easily retargetable to new languages.
We examine the impact of parameters such as tokenization and training data selection on the quality of decompilation.
We will release our training data, trained decompilation models, and code to help encourage future research into language-agnostic decompilation.
arXiv Detail & Related papers (2022-12-17T20:45:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.