ReSMT: An SMT-Based Tool for Reverse Engineering
- URL: http://arxiv.org/abs/2512.22076v1
- Date: Fri, 26 Dec 2025 16:29:31 GMT
- Title: ReSMT: An SMT-Based Tool for Reverse Engineering
- Authors: Nir Somech, Guy Katz
- Abstract summary: Software obfuscation techniques make code more difficult to understand, without changing its functionality. Reverse Engineering of obfuscated code is notoriously difficult. We present a novel, automated tool for addressing some of the challenges in reverse engineering of obfuscated code.
- Score: 2.2058293096044586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software obfuscation techniques make code more difficult to understand, without changing its functionality. Such techniques are often used by authors of malicious software to avoid detection. Reverse Engineering of obfuscated code, i.e., the process of overcoming obfuscation and answering questions about the functionality of the code, is notoriously difficult; and while various tools and methods exist for this purpose, the process remains complex and slow, especially when dealing with layered or customized obfuscation techniques. Here, we present a novel, automated tool for addressing some of the challenges in reverse engineering of obfuscated code. Our tool, called ReSMT, converts the obfuscated assembly code into a complex system of logical assertions that represent the code functionality, and then applies SMT solving and simulation tools to inspect the obfuscated code's execution. The approach is mostly automatic, alleviating the need for highly specialized deobfuscation skills. In an elaborate case study that we conducted, ReSMT successfully tackled complex obfuscated code, and was able to solve reverse-engineering queries about it. We believe that these results showcase the potential and usefulness of our proposed approach.
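The idea in the abstract, turning each instruction into a logical assertion and then posing a query about the result, can be illustrated with a toy sketch. This is not ReSMT's actual encoding: the four-instruction snippet, the 8-bit register width, and the names `obfuscated` and `find_counterexample` are all hypothetical, and an exhaustive search over inputs stands in for the SMT solver that a real tool would use.

```python
from itertools import product

MASK = 0xFF  # assumption: 8-bit registers, so exhaustive search stays small


def obfuscated(a, b):
    """Hypothetical obfuscated snippet, one SSA-style assignment per instruction."""
    t1 = a ^ b              # XOR  t1, a, b
    t2 = a & b              # AND  t2, a, b
    t3 = (t2 << 1) & MASK   # SHL  t3, t2, 1
    return (t1 + t3) & MASK  # ADD  out, t1, t3


def find_counterexample(spec):
    """Return the first (a, b) where the snippet and spec disagree, else None.

    Plays the role of the solver query "is obfuscated != spec satisfiable?".
    """
    for a, b in product(range(MASK + 1), repeat=2):
        if obfuscated(a, b) != spec(a, b):
            return (a, b)
    return None


# Query 1: does the snippet compute plain addition modulo 256?
add_result = find_counterexample(lambda a, b: (a + b) & MASK)
# Query 2: does the snippet compute plain XOR?
xor_result = find_counterexample(lambda a, b: a ^ b)

print(add_result)
print(xor_result)
```

The first query finds no counterexample, so the snippet behaves as addition on every 8-bit input; the second returns `(1, 1)`, a concrete witness that it is not XOR. With a bit-vector SMT solver the same two questions would be asked symbolically, which is what makes the approach feasible at realistic register widths.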
Related papers
- Context-Guided Decompilation: A Step Towards Re-executability [50.71992919223209]
Binary decompilation plays an important role in software security analysis, reverse engineering and malware understanding. Recent advances in large language models (LLMs) have enabled neural decompilation, but the generated code is typically only semantically plausible. We propose ICL4Decomp, a hybrid decompilation framework that leverages in-context learning (ICL) to guide LLMs toward generating re-executable source code.
arXiv Detail & Related papers (2025-11-03T17:21:39Z)
- CASCADE: LLM-Powered JavaScript Deobfuscator at Google [1.7266435334810277]
Software obfuscation, particularly prevalent in JavaScript, hinders code comprehension and analysis. This paper introduces CASCADE, a novel hybrid approach that integrates the advanced coding capabilities of Gemini with the deterministic transformation capabilities of a compiler. CASCADE is already deployed in Google's production environment, demonstrating substantial improvements in JavaScript deobfuscation efficiency.
arXiv Detail & Related papers (2025-07-23T16:57:32Z)
- Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Of the 78,047,845 smart contracts that Etherscan lists as deployed on Ethereum (as of May 26, 2025), a mere 767,520 (< 1%) are open source. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode. We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z)
- An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding [50.17907898478795]
This work proposes a benchmark to evaluate the effectiveness of Large Language Models (LLMs) in real-world reverse engineering scenarios. Our evaluations reveal that existing LLMs can understand binary code to a certain extent, thereby improving the efficiency of binary code analysis.
arXiv Detail & Related papers (2025-04-30T17:02:06Z)
- Simplicity by Obfuscation: Evaluating LLM-Driven Code Transformation with Semantic Elasticity [4.458584890504334]
Code obfuscation aims to prevent reverse engineering and intellectual property theft. The recent development of large language models paves the way for practical applications in different domains. This work performs an empirical study on the ability of LLMs to obfuscate Python source code.
arXiv Detail & Related papers (2025-04-18T18:29:23Z)
- The Code Barrier: What LLMs Actually Understand? [7.407441962359689]
This research uses code obfuscation as a structured testing framework to evaluate semantic understanding capabilities of language models. Findings show a statistically significant performance decline as obfuscation complexity increases. This research introduces a new evaluation approach for assessing code comprehension in language models.
arXiv Detail & Related papers (2025-04-14T14:11:26Z)
- ObfusQate: Unveiling the First Quantum Program Obfuscation Framework [0.0]
ObfusQate is a novel tool that conducts obfuscations using quantum primitives to enhance the security of classical and quantum programs. We have designed and implemented two primary categories of obfuscations: quantum circuit level obfuscation and code level obfuscation.
arXiv Detail & Related papers (2025-03-31T07:02:25Z)
- ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [81.12673534903979]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z)
- Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation [36.12009987721901]
Malware authors often employ code obfuscations to make their malware harder to detect. Existing tools for generating obfuscated code often require access to the original source code. Can Large Language Models generate new obfuscated assembly code? If so, this poses a risk to anti-virus engines and potentially increases the flexibility of attackers to create new obfuscation patterns.
arXiv Detail & Related papers (2024-12-20T18:31:24Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that although the approach is simple, it outperforms the state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)