JSidentify-V2: Leveraging Dynamic Memory Fingerprinting for Mini-Game Plagiarism Detection
- URL: http://arxiv.org/abs/2508.01655v1
- Date: Sun, 03 Aug 2025 08:26:13 GMT
- Title: JSidentify-V2: Leveraging Dynamic Memory Fingerprinting for Mini-Game Plagiarism Detection
- Authors: Zhihao Li, Chaozheng Wang, Zongjie Li, Xinyong Peng, Qun Xia, Haochuan Lu, Ting Xiong, Shuzheng Gao, Cuiyun Gao, Shuai Wang, Yuetang Deng, Huafeng Ma
- Abstract summary: JSidentify-V2 is a novel dynamic analysis framework that detects mini-game plagiarism. JSidentify-V2 captures memory invariants during program execution. We evaluate JSidentify-V2 against eight obfuscation methods on a dataset of 1,200 mini-games.
- Score: 18.504553340594462
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The explosive growth of mini-game platforms has led to widespread code plagiarism, where malicious users access popular games' source code and republish them with modifications. While existing static analysis tools can detect simple obfuscation techniques like variable renaming and dead code injection, they fail against sophisticated deep obfuscation methods such as encrypted code with local or cloud-based decryption keys that completely destroy code structure and render traditional Abstract Syntax Tree analysis ineffective. To address these challenges, we present JSidentify-V2, a novel dynamic analysis framework that detects mini-game plagiarism by capturing memory invariants during program execution. Our key insight is that while obfuscation can severely distort static code characteristics, runtime memory behavior patterns remain relatively stable. JSidentify-V2 employs a four-stage pipeline: (1) static pre-analysis and instrumentation to identify potential memory invariants, (2) adaptive hot object slicing to maximize execution coverage of critical code segments, (3) Memory Dependency Graph construction to represent behavioral fingerprints resilient to obfuscation, and (4) graph-based similarity analysis for plagiarism detection. We evaluate JSidentify-V2 against eight obfuscation methods on a comprehensive dataset of 1,200 mini-games ...
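The last two stages of the pipeline described in the abstract (a graph fingerprint that survives obfuscation, followed by graph-based similarity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's method: the graph encoding (node id mapped to a structural shape label, plus directed dependency edges), the function names `fingerprint` and `similarity`, and the choice of multiset Jaccard as the similarity measure are all hypothetical.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): each runtime object has an id and
# a structural "shape" label; edges are directed dependencies between ids.
Nodes = dict[str, str]
Edges = list[tuple[str, str]]


def fingerprint(nodes: Nodes, edges: Edges) -> Counter:
    """Multiset of (source shape, target shape) pairs.

    Identifiers are discarded, so renaming variables does not change the result.
    """
    return Counter((nodes[src], nodes[dst]) for src, dst in edges)


def similarity(a: Counter, b: Counter) -> float:
    """Multiset Jaccard similarity: |a & b| / |a | b|, in the range [0, 1]."""
    union = sum((a | b).values())
    if union == 0:
        return 1.0  # two empty fingerprints are treated as identical
    return sum((a & b).values()) / union


# Toy data: the "obfuscated" graph differs from the original only in its names.
original = fingerprint(
    {"player": "object:3", "score": "number", "enemies": "array"},
    [("player", "score"), ("enemies", "player")],
)
obfuscated = fingerprint(
    {"_0xa1": "object:3", "_0xb2": "number", "_0xc3": "array"},
    [("_0xa1", "_0xb2"), ("_0xc3", "_0xa1")],
)
unrelated = fingerprint(
    {"board": "array", "timer": "number"},
    [("board", "timer")],
)

print(similarity(original, obfuscated))  # 1.0
print(similarity(original, unrelated))   # 0.0
```

The point of the sketch is only that a fingerprint built from structure rather than identifiers is unaffected by renaming; a real system would need far richer node and edge features than this toy comparison uses.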
Related papers
- Disappearing Ink: Obfuscation Breaks N-gram Code Watermarks in Theory and Practice [23.788321123219244]
Distinguishing AI-generated code from human-written code is crucial for authorship attribution, content tracking, and misuse detection. N-gram-based watermarking schemes, which inject secret watermarks during generation for later detection, have become prominent. Most robustness claims rely solely on defenses against simple code transformations or code optimizations as a simulation of attack, creating a questionable sense of robustness.
arXiv Detail & Related papers (2025-07-07T22:18:19Z) - Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Of the 78,047,845 smart contracts listed on Etherscan (as of May 26, 2025), a mere 767,520 (about 1%) are open source. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode. We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - Mechanistic Interpretability in the Presence of Architectural Obfuscation [0.0]
Architectural obfuscation is a lightweight substitute for heavyweight cryptography in privacy-preserving large-language-model (LLM) inference. We analyze a GPT-2-small model trained from scratch with a representative obfuscation map. Our findings reveal that obfuscation dramatically alters activation patterns within attention heads yet preserves the layer-wise computational graph.
arXiv Detail & Related papers (2025-06-22T14:39:16Z) - Dynamic Graph-based Fingerprinting of In-browser Cryptomining [0.5261718469769449]
Cryptojacking is an attack that uses stolen computing resources to mine cryptocurrencies without consent for profit. In-browser cryptojacking malware exploits web technologies like WebAssembly to mine cryptocurrencies directly within the browser. We propose using instruction-level data-flow graphs to detect cryptomining behavior.
arXiv Detail & Related papers (2025-05-05T09:21:58Z) - Identifying Obfuscated Code through Graph-Based Semantic Analysis of Binary Code [5.181058136007981]
This paper investigates the problem of function-level obfuscation detection using graph-based approaches. We consider various obfuscation types and obfuscators, resulting in two complex datasets. Our approach shows satisfactory results, especially in a challenging 11-class classification task and in a practical malware analysis example.
arXiv Detail & Related papers (2025-04-02T08:36:27Z) - Enhancing Malware Fingerprinting through Analysis of Evasive Techniques [15.037069167445846]
We analyze 4 million Windows Portable Executable (PE) files, 21 million sections, and 48 million resources. We find up to 80% deep structural similarities, including common APIs and executable sections. Our analysis reveals non-functional mutations, such as altered section numbers, virtual sizes, and section names, as primary evasion tactics.
arXiv Detail & Related papers (2025-03-09T07:41:49Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [51.898805184427545]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z) - Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z) - Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label from the object-level instead of the image-level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.