OBsmith: Testing JavaScript Obfuscator using LLM-powered sketching
- URL: http://arxiv.org/abs/2510.10066v1
- Date: Sat, 11 Oct 2025 07:02:42 GMT
- Title: OBsmith: Testing JavaScript Obfuscator using LLM-powered sketching
- Authors: Shan Jiang, Chenguang Zhu, Sarfraz Khurshid
- Abstract summary: JavaScript obfuscators are widely deployed to protect intellectual property and resist reverse engineering. Existing evaluations measure resistance to deobfuscation, leaving the critical question of whether obfuscators preserve semantics unanswered. We present OBsmith, a novel framework to systematically test JavaScript obfuscators using large language models.
- Score: 11.58496128577643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: JavaScript obfuscators are widely deployed to protect intellectual property and resist reverse engineering, yet their correctness has been largely overlooked compared to performance and resilience. Existing evaluations typically measure resistance to deobfuscation, leaving the critical question of whether obfuscators preserve program semantics unanswered. Incorrect transformations can silently alter functionality, compromise reliability, and erode security, undermining the very purpose of obfuscation. To address this gap, we present OBsmith, a novel framework to systematically test JavaScript obfuscators using large language models (LLMs). OBsmith leverages LLMs to generate program sketches (abstract templates capturing diverse language constructs, idioms, and corner cases), which are instantiated into executable programs and subjected to obfuscation under different configurations. Besides LLM-powered sketching, OBsmith also employs a second source: automatic extraction of sketches from real programs. This extraction path enables more focused testing of project-specific features and lets developers inject domain knowledge into the resulting test cases. OBsmith uncovers 11 previously unknown correctness bugs. Under an equal program budget, five general-purpose state-of-the-art JavaScript fuzzers (FuzzJIT, Jsfunfuzz, Superion, DIE, Fuzzilli) failed to detect these issues, highlighting OBsmith's complementary focus on obfuscation-induced misbehavior. An ablation shows that all components except our generic MRs contribute to at least one bug class; the negative MR result suggests the need for obfuscator-specific metamorphic relations. Our results also seed discussion on how to balance obfuscation presets and performance cost. We envision OBsmith as an important step towards automated testing and quality assurance of obfuscators and other semantic-preserving toolchains.
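The workflow the abstract describes (instantiate a sketch into concrete programs, obfuscate each one, and check that behavior is unchanged) can be illustrated with a minimal TypeScript sketch. This is not OBsmith's implementation: the `Sketch` type, the `__HOLEi__` marker syntax, and the functions `instantiate`, `observe`, `buggyObfuscate`, and `findMismatches` are all hypothetical names chosen for illustration, and `buggyObfuscate` is a toy stand-in with a deliberately planted bug rather than a real obfuscator.

```typescript
// Hypothetical sketch format: a template whose holes are written __HOLE0__, __HOLE1__, ...
// and, for each hole, a list of candidate fillings.
type Sketch = { template: string; holes: string[][] };

// Instantiate the sketch with every combination of hole fillings.
function instantiate(sketch: Sketch): string[] {
  let programs: string[] = [sketch.template];
  sketch.holes.forEach((choices, i) => {
    const next: string[] = [];
    for (const p of programs) {
      for (const c of choices) {
        next.push(p.split(`__HOLE${i}__`).join(c));
      }
    }
    programs = next;
  });
  return programs;
}

// Run a function body and record either its return value or the error name.
function observe(source: string): string {
  try {
    return "ok:" + JSON.stringify(new Function(source)());
  } catch (e) {
    return "error:" + (e instanceof Error ? e.name : "unknown");
  }
}

// Toy stand-in for an obfuscator (assumption, not a real tool):
// it contains a planted bug that weakens strict equality to loose equality.
function buggyObfuscate(source: string): string {
  return source.split("===").join("==");
}

// Differential oracle: keep the programs whose observable behavior
// differs before and after the transformation.
function findMismatches(
  sketch: Sketch,
  obfuscate: (source: string) => string
): string[] {
  return instantiate(sketch).filter(
    (p) => observe(p) !== observe(obfuscate(p))
  );
}

const sketch: Sketch = {
  template: "const a = __HOLE0__; const b = __HOLE1__; return a === b;",
  holes: [["1", "'1'", "null"], ["1", "undefined"]],
};

const mismatches = findMismatches(sketch, buggyObfuscate);
for (const program of mismatches) {
  console.log("semantics changed for:", program);
}
```

In this toy setup the corner-case fillings (`'1'` against `1`, and `null` against `undefined`) are what expose the planted bug, which mirrors why sketches that enumerate idioms and corner cases are useful inputs for this kind of differential check.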
Related papers
- From Obfuscated to Obvious: A Comprehensive JavaScript Deobfuscation Tool for Security Analysis [6.038443052154118]
JSIMPLIFIER is a comprehensive deobfuscation tool using a multi-stage pipeline with preprocessing. We construct and release the largest real-world obfuscated JavaScript dataset with 44,421 samples. Our results advance benchmarks for JavaScript deobfuscation research and practical security applications.
arXiv Detail & Related papers (2025-12-16T04:13:09Z)
- Bag of Tricks for Subverting Reasoning-based Safety Guardrails [62.139297207938036]
We present a bag of jailbreak methods that subvert the reasoning-based guardrails. Our attacks span white-, gray-, and black-box settings and range from effortless template manipulations to fully automated optimization.
arXiv Detail & Related papers (2025-10-13T16:16:44Z)
- "Digital Camouflage": The LLVM Challenge in LLM-Based Malware Detection [0.0]
Large Language Models (LLMs) have emerged as promising tools for malware detection. However, their reliability under adversarial compiler-level obfuscation is yet to be discovered. This study empirically evaluates the robustness of three state-of-the-art LLMs against compiler-level obfuscation techniques.
arXiv Detail & Related papers (2025-09-20T12:47:36Z)
- JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering [73.962469626788]
Jailbreak attacks against multimodal large language models (MLLMs) are a significant research focus. We propose JPS, Jailbreak MLLMs with collaborative visual Perturbation and textual Steering.
arXiv Detail & Related papers (2025-08-07T07:14:01Z)
- CASCADE: LLM-Powered JavaScript Deobfuscator at Google [1.7266435334810277]
Software obfuscation, particularly prevalent in JavaScript, hinders code comprehension and analysis. This paper introduces CASCADE, a novel hybrid approach that integrates the advanced coding capabilities of Gemini with the deterministic transformation capabilities of a compiler. CASCADE is already deployed in Google's production environment, demonstrating substantial improvements in JavaScript deobfuscation efficiency.
arXiv Detail & Related papers (2025-07-23T16:57:32Z)
- JsDeObsBench: Measuring and Benchmarking LLMs for JavaScript Deobfuscation [34.88009582470047]
Large Language Models (LLMs) have recently shown promise in automating the deobfuscation process. We present JsDeObsBench, a benchmark designed to rigorously evaluate the effectiveness of LLMs in the context of JS deobfuscation.
arXiv Detail & Related papers (2025-06-25T06:50:13Z)
- Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Although Etherscan lists 78,047,845 deployed smart contracts (as of May 26, 2025), a mere 767,520 (< 1%) are open source. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode. We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z)
- Simplicity by Obfuscation: Evaluating LLM-Driven Code Transformation with Semantic Elasticity [4.458584890504334]
Code obfuscation aims to prevent reverse engineering and intellectual property theft. The recent development of large language models paves the way for practical applications in different domains. This work performs an empirical study on the ability of LLMs to obfuscate Python source code.
arXiv Detail & Related papers (2025-04-18T18:29:23Z)
- ObfusQate: Unveiling the First Quantum Program Obfuscation Framework [0.0]
ObfusQate is a novel tool that conducts obfuscations using quantum primitives to enhance the security of classical and quantum programs. We have designed and implemented two primary categories of obfuscations: quantum circuit level obfuscation and code level obfuscation.
arXiv Detail & Related papers (2025-03-31T07:02:25Z)
- ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding [60.37988508851391]
Language models (LMs) have become a staple of the code-writing toolbox. Research exploring modifications to Code-LMs' pre-training objectives, geared towards improving data efficiency and better disentangling between syntax and semantics, has been noticeably sparse. In this work, we examine grounding on obfuscated code as a means of helping Code-LMs look beyond the surface-form syntax and enhance their pre-training sample efficiency.
arXiv Detail & Related papers (2025-03-27T23:08:53Z)
- ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
- ShadowCode: Towards (Automatic) External Prompt Injection Attack against Code LLMs [56.46702494338318]
This paper introduces a new attack paradigm: (automatic) external prompt injection against code-oriented large language models. We propose ShadowCode, a simple yet effective method that automatically generates induced perturbations based on code simulation. We evaluate our method across 13 distinct malicious objectives, generating 31 threat cases spanning three popular programming languages.
arXiv Detail & Related papers (2024-07-12T10:59:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.