Unsafe's Betrayal: Abusing Unsafe Rust in Binary Reverse Engineering toward Finding Memory-safety Bugs via Machine Learning
- URL: http://arxiv.org/abs/2211.00111v1
- Date: Mon, 31 Oct 2022 19:32:18 GMT
- Title: Unsafe's Betrayal: Abusing Unsafe Rust in Binary Reverse Engineering toward Finding Memory-safety Bugs via Machine Learning
- Authors: Sangdon Park and Xiang Cheng and Taesoo Kim
- Abstract summary: Rust provides memory-safe mechanisms to avoid memory-safety bugs in programming.
Unsafe code that enhances the usability of Rust provides clear spots for finding memory-safety bugs.
We claim that these unsafe spots can still be identifiable in Rust binary code via machine learning.
- Score: 20.68333298047064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memory-safety bugs introduce critical software-security issues. Rust provides
memory-safe mechanisms to avoid memory-safety bugs in programming, while still
allowing unsafe escape hatches via unsafe code. However, the unsafe code that
enhances the usability of Rust provides clear spots for finding memory-safety
bugs in Rust source code. In this paper, we claim that these unsafe spots can
still be identifiable in Rust binary code via machine learning and be leveraged
for finding memory-safety bugs. To support our claim, we propose the tool
rustspot, which enables a reverse engineer to learn an unsafe classifier
that proposes a list of functions in Rust binaries for downstream analysis. We
empirically show that the function proposals by rustspot can recall
92.92% of memory-safety bugs while covering only 16.79% of the entire
binary code. As an application, we demonstrate that the function proposals can be
used for targeted fuzzing of Rust packages, which reduces fuzzing time compared
to non-targeted fuzzing.
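To make the claim concrete, here is a minimal, hypothetical sketch (not taken from the paper or its artifacts) of the kind of "unsafe spot" rustspot aims to surface: a Rust function whose signature looks safe but whose body uses unsafe code and carries a latent out-of-bounds write.

```rust
// Hypothetical example of an unsafe spot: the public signature is safe,
// but the unsafe block skips bounds checking, so any input longer than
// 16 bytes overflows the stack buffer (undefined behavior).
pub fn copy_header(src: &[u8]) -> [u8; 16] {
    let mut buf = [0u8; 16];
    unsafe {
        // BUG: the copy length comes from the caller and is never clamped to 16.
        std::ptr::copy_nonoverlapping(src.as_ptr(), buf.as_mut_ptr(), src.len());
    }
    buf
}
```

A function proposal for `copy_header` could then drive targeted fuzzing. The sketch below uses a standard cargo-fuzz/libfuzzer-sys harness; the crate name `my_crate` and the choice of harness are assumptions, not details from the paper.

```rust
// fuzz/fuzz_targets/copy_header.rs (hypothetical harness)
#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Exercise only the proposed function, narrowing the search space
    // relative to fuzzing the whole crate surface.
    let _ = my_crate::copy_header(data);
});
```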
Related papers
- BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
arXiv Detail & Related papers (2024-06-24T19:29:47Z)
- Characterizing Unsafe Code Encapsulation In Real-world Rust Systems [2.285834282327349]
Interior unsafe is an essential design paradigm advocated by the Rust community in system software development.
The Rust compiler is incapable of verifying the soundness of a safe function containing unsafe code.
We propose a novel unsafety isolation graph to model the essential usage and encapsulation of unsafe code; a minimal sketch of this interior-unsafe pattern appears after this list.
arXiv Detail & Related papers (2024-06-12T06:59:51Z)
- VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners [6.824327908701066]
Rust is a programming language that combines memory safety and low-level control, providing C-like performance.
Existing work falls into two categories: rule-based and large language model (LLM)-based.
We present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness.
arXiv Detail & Related papers (2024-04-29T16:45:03Z)
- FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
FoC-BinLLM outperforms ChatGPT by 14.61% on the ROUGE-L score.
FoC-Sim outperforms the previous best methods with a 52% higher Recall@1.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
- Fast Summary-based Whole-program Analysis to Identify Unsafe Memory Accesses in Rust [23.0568924498396]
Rust is one of the most promising systems programming languages to solve the memory safety issues that have plagued low-level software for over forty years.
unsafe Rust code and directly-linked unsafe foreign libraries may not only introduce memory safety violations themselves but also compromise the entire program as they run in the same monolithic address space as the safe Rust.
We have prototyped a whole-program analysis for identifying both unsafe heap allocations and memory accesses to those unsafe heap objects.
arXiv Detail & Related papers (2023-10-16T11:34:21Z)
- rCanary: Detecting Memory Leaks Across Semi-automated Memory Management Boundary in Rust [4.616001680122352]
Rust is a system programming language that guarantees memory safety via compile-time verifications.
We present rCanary, a static, non-intrusive, and fully automated model checker to detect leaks across the semi-automated memory management boundary.
arXiv Detail & Related papers (2023-08-09T08:26:04Z)
- Is unsafe an Achilles' Heel? A Comprehensive Study of Safety Requirements in Unsafe Rust Programming [4.981203415693332]
Rust is an emerging, strongly-typed programming language focusing on efficiency and memory safety.
Current unsafe API documents in the standard library exhibited variations, including inconsistency and insufficiency.
To enhance Rust security, we suggest that unsafe API documents list systematic descriptions of safety requirements for users to follow.
arXiv Detail & Related papers (2023-08-09T08:16:10Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Safe Deep Reinforcement Learning by Verifying Task-Level Properties [84.64203221849648]
Cost functions are commonly employed in Safe Deep Reinforcement Learning (DRL).
The cost is typically encoded as an indicator function due to the difficulty of quantifying the risk of policy decisions in the state space.
In this paper, we investigate an alternative approach that uses domain knowledge to quantify the risk in the proximity of such states by defining a violation metric.
arXiv Detail & Related papers (2023-02-20T15:24:06Z)
- Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provably safe RL when given access to an offline oracle providing binary feedback on the safety of state-action pairs.
We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z)
- Sample-Efficient Safety Assurances using Conformal Prediction [57.92013073974406]
Early warning systems can provide alerts when an unsafe situation is imminent.
To reliably improve safety, these warning systems should have a provable false negative rate.
We present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics.
arXiv Detail & Related papers (2021-09-28T23:00:30Z)
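As referenced in the "Characterizing Unsafe Code Encapsulation" entry above, here is a minimal sketch of the interior-unsafe pattern whose soundness the Rust compiler cannot verify. This is an assumed illustration, not code from that paper.

```rust
// Interior unsafe: the method's signature is safe, but its soundness
// depends on an invariant (ptr is valid for len bytes) that the
// compiler cannot check; the encapsulating type must uphold it.
pub struct RawBuf {
    ptr: *mut u8,
    len: usize,
}

impl RawBuf {
    /// Safe-looking accessor; sound only if the invariant above holds.
    pub fn as_slice(&self) -> &[u8] {
        unsafe { std::slice::from_raw_parts(self.ptr, self.len) }
    }
}
```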