Related papers: rCanary: Detecting Memory Leaks Across Semi-automated Memory Management Boundary in Rust

rCanary: Detecting Memory Leaks Across Semi-automated Memory Management Boundary in Rust

URL: http://arxiv.org/abs/2308.04787v2
Date: Thu, 1 Aug 2024 09:02:04 GMT
Title: rCanary: Detecting Memory Leaks Across Semi-automated Memory Management Boundary in Rust
Authors: Mohan Cui, Hui Xu, Hongliang Tian, Yangfan Zhou,
Abstract summary: Rust is a system programming language that guarantees memory safety via compile-time verifications. We present rCanary, a static, non-automated, and fully automated model checker to detect leaks across semiautomated boundary.
Score: 4.616001680122352
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Rust is an effective system programming language that guarantees memory safety via compile-time verifications. It employs a novel ownership-based resource management model to facilitate automated deallocation. This model is anticipated to eliminate memory leaks. However, we observed that user intervention drives it into semi-automated memory management and makes it error-prone to cause leaks. In contrast to violating memory-safety guarantees restricted by the unsafe keyword, the boundary of leaking memory is implicit, and the compiler would not emit any warnings for developers. In this paper, we present rCanary, a static, non-intrusive, and fully automated model checker to detect leaks across the semiautomated boundary. We design an encoder to abstract data with heap allocation and formalize a refined leak-free memory model based on boolean satisfiability. It can generate SMT-Lib2 format constraints for Rust MIR and is implemented as a Cargo component. We evaluate rCanary by using flawed package benchmarks collected from the pull requests of open-source Rust projects. The results indicate that it is possible to recall all these defects with acceptable false positives. We further apply our tool to more than 1,200 real-world crates from crates.io and GitHub, identifying 19 crates having memory leaks. Our analyzer is also efficient, that costs 8.4 seconds per package.

Related papers

CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation [63.23120252801889]
CRUST-Bench is a dataset of 100 C repositories, each paired with manually-written interfaces in safe Rust as well as test cases. We evaluate state-of-the-art large language models (LLMs) on this task and find that safe and idiomatic Rust generation is still a challenging problem. The best performing model, OpenAI o1, is able to solve only 15 tasks in a single-shot setting.
arXiv Detail & Related papers (2025-04-21T17:33:33Z)
LeakGuard: Detecting Memory Leaks Accurately and Scalably [3.256598917442277]
LeakGuard is a memory leak detection tool which provides satisfactory balance of accuracy and scalability. For accuracy, LeakGuard analyzes the behaviors of library and developer-defined memory allocation and deallocation functions. For scalability, LeakGuard examines each function of interest independently by using its function summary and under-constrained symbolic execution technique.
arXiv Detail & Related papers (2025-04-06T09:11:37Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
AddressWatcher: Sanitizer-Based Localization of Memory Leak Fixes [6.31619298702529]
Memory leak bugs are a major problem in C/C++ programs. Several techniques have been proposed to automatically fix memory leaks. Static-based approaches attempt to trace the complete semantics of memory object across all paths. Dynamic approaches can spell out precise semantics of memory object only on a single execution path.
arXiv Detail & Related papers (2024-08-08T21:40:22Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
Fast Summary-based Whole-program Analysis to Identify Unsafe Memory Accesses in Rust [23.0568924498396]
Rust is one of the most promising systems programming languages to solve the memory safety issues that have plagued low-level software for over forty years. unsafe Rust code and directly-linked unsafe foreign libraries may not only introduce memory safety violations themselves but also compromise the entire program as they run in the same monolithic address space as the safe Rust. We have prototyped a whole-program analysis for identifying both unsafe heap allocations and memory accesses to those unsafe heap objects.
arXiv Detail & Related papers (2023-10-16T11:34:21Z)
Yuga: Automatically Detecting Lifetime Annotation Bugs in the Rust Language [15.164423552903571]
Security vulnerabilities have been reported in Rust projects, often attributed to the use of "unsafe" Rust code. These vulnerabilities, in part, arise from incorrect lifetime annotations on function signatures. Existing tools fail to detect these bugs, primarily because such bugs are rare, challenging to detect through dynamic analysis. We devise a novel static analysis tool, Yuga, to detect potential lifetime annotation bugs.
arXiv Detail & Related papers (2023-10-12T17:05:03Z)
Fixing Rust Compilation Errors using LLMs [2.1781086368581932]
The Rust programming language has established itself as a viable choice for low-level systems programming language over the traditional, unsafe alternatives like C/C++. This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors. RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
arXiv Detail & Related papers (2023-08-09T18:30:27Z)
Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory [72.36736686941671]
We propose a novel framework, selfmem, for improving retrieval-augmented generation models. Selfmem iteratively employs a retrieval-augmented generator to create an unbounded memory pool and using a memory selector to choose one output as memory for the subsequent generation round. We evaluate the effectiveness of selfmem on three distinct text generation tasks.
arXiv Detail & Related papers (2023-05-03T21:40:54Z)
Unsafe's Betrayal: Abusing Unsafe Rust in Binary Reverse Engineering toward Finding Memory-safety Bugs via Machine Learning [20.68333298047064]
Rust provides memory-safe mechanisms to avoid memory-safety bugs in programming. Unsafe code that enhances the usability of Rust provides clear spots for finding memory-safety bugs. We claim that these unsafe spots can still be identifiable in Rust binary code via machine learning.
arXiv Detail & Related papers (2022-10-31T19:32:18Z)
Recurrent Dynamic Embedding for Video Object Segmentation [54.52527157232795]
We propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size. We propose an unbiased guidance loss during the training stage, which makes SAM more robust in long videos. We also design a novel self-correction strategy so that the network can repair the embeddings of masks with different qualities in the memory bank.
arXiv Detail & Related papers (2022-05-08T02:24:43Z)
LaMemo: Language Modeling with Look-Ahead Memory [50.6248714811912]
We propose Look-Ahead Memory (LaMemo) that enhances the recurrence memory by incrementally attending to the right-side tokens. LaMemo embraces bi-directional attention and segment recurrence with an additional overhead only linearly proportional to the memory length. Experiments on widely used language modeling benchmarks demonstrate its superiority over the baselines equipped with different types of memory.
arXiv Detail & Related papers (2022-04-15T06:11:25Z)
Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model. We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory. We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
DMV: Visual Object Tracking via Part-level Dense Memory and Voting-based Retrieval [61.366644088881735]
We propose a novel memory-based tracker via part-level dense memory and voting-based retrieval, called DMV. We also propose a novel voting mechanism for the memory reading to filter out unreliable information in the memory.
arXiv Detail & Related papers (2020-03-20T10:05:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.