Related papers: An Empirical Study of Code Obfuscation Practices in the Google Play Store

URL: http://arxiv.org/abs/2502.04636v1
Date: Fri, 07 Feb 2025 03:41:40 GMT
Title: An Empirical Study of Code Obfuscation Practices in the Google Play Store
Authors: Akila Niroshan, Suranga Seneviratne, Aruna Seneviratne
Abstract summary: We analyze over 500,000 Android APKs from Google Play, spanning an eight-year period. Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools. Obfuscation is more prevalent in top-ranked apps and gaming genres such as Casino apps.
Score: 4.177277588440524
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Android ecosystem is vulnerable to issues such as app repackaging, counterfeiting, and piracy, threatening both developers and users. To mitigate these risks, developers often employ code obfuscation techniques. However, while effective in protecting legitimate applications, obfuscation also hinders security investigations as it is often exploited for malicious purposes. As such, it is important to understand code obfuscation practices in Android apps. In this paper, we analyze over 500,000 Android APKs from Google Play, spanning an eight-year period, to investigate the evolution and prevalence of code obfuscation techniques. First, we propose a set of classifiers to detect obfuscated code, tools, and techniques and then conduct a longitudinal analysis to identify trends. Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools. We also show that obfuscation is more prevalent in top-ranked apps and gaming genres such as Casino apps. To our knowledge, this is the first large-scale study of obfuscation adoption in the Google Play Store, providing insights for developers and security analysts.

Related papers

Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite Etherscan's 78,047,845 smart contracts deployed on Ethereum (as of May 26, 2025), a mere 767,520 (about 1%) are open source. This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode. We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z)
LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges [70.85114705489222]
We propose MalwareBench, a benchmark dataset containing 3,520 jailbreaking prompts for malicious code generation. MalwareBench is based on 320 manually crafted malicious code-generation requirements, covering 11 jailbreak methods and 29 code functionality categories. Experiments show that mainstream LLMs exhibit limited ability to reject malicious code-generation requirements, and the combination of multiple jailbreak methods further reduces the model's security capabilities.
arXiv Detail & Related papers (2025-06-09T12:02:39Z)
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense [57.86886012610389]
Jailbreak attacks exploit vulnerabilities to elicit unintended or harmful outputs. We introduce Layer-AdvPatcher, a novel methodology designed to defend against jailbreak attacks. We conduct extensive experiments on two models, four benchmark datasets, and multiple state-of-the-art jailbreak benchmarks to demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2025-01-05T19:06:03Z)
Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation [36.12009987721901]
Malware authors often employ code obfuscations to make their malware harder to detect. Existing tools for generating obfuscated code often require access to the original source code. Can Large Language Models generate new obfuscated assembly code? If so, this poses a risk to anti-virus engines and potentially increases the flexibility of attackers to create new obfuscation patterns.
arXiv Detail & Related papers (2024-12-20T18:31:24Z)
A Risk Estimation Study of Native Code Vulnerabilities in Android Applications [1.6078134198754157]
We propose a fast risk-based approach that provides a risk score related to the native part of an Android application. We show that many applications contain well-known vulnerabilities that miscreants can potentially exploit.
arXiv Detail & Related papers (2024-06-04T06:44:07Z)
FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques [53.288368877654705]
FV8 is a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code. It selectively enforces code execution on APIs that conditionally inject dynamic code. It identifies 1,443 npm packages and 164 (82%) extensions containing at least one type of evasion.
arXiv Detail & Related papers (2024-05-21T19:54:19Z)
Understanding crypter-as-a-service in a popular underground marketplace [51.328567400947435]
Crypters are pieces of software whose main goal is to transform a target binary so that it can avoid detection by antivirus (AV) applications. The crypter-as-a-service model has gained popularity in response to the increased sophistication of detection mechanisms. This paper provides the first study of an online underground market dedicated to crypter-as-a-service.
arXiv Detail & Related papers (2024-05-20T08:35:39Z)
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models [123.66104233291065]
Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques does not adequately address. JailbreakBench is an open-sourced benchmark with the following components.
arXiv Detail & Related papers (2024-03-28T02:44:02Z)
Cryptic Bytes: WebAssembly Obfuscation for Evading Cryptojacking Detection [0.0]
We present the most comprehensive evaluation of code obfuscation techniques for WebAssembly to date. We obfuscate a diverse set of applications, including utilities, games, and crypto miners, using state-of-the-art obfuscation tools like Tigress and wasm-mutate. Our dataset of over 20,000 obfuscated WebAssembly binaries and the emcc-obf tool are publicly available to stimulate further research.
arXiv Detail & Related papers (2024-03-22T13:32:08Z)
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics. We conduct extensive experiments on 7 Large Language Models, achieving state-of-the-art average Attack Success Rate (ASR). Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
Weak-to-Strong Jailbreaking on Large Language Models [96.50953637783581]
Large language models (LLMs) are vulnerable to jailbreak attacks. Existing jailbreaking methods are computationally costly. We propose the weak-to-strong jailbreaking attack.
arXiv Detail & Related papers (2024-01-30T18:48:37Z)
AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate [0.0]
The new field of AI-based blackbox deobfuscation is still in its infancy. Xyntia, a new optimized AI-based blackbox deobfuscator, significantly outperforms prior work in terms of success rate. We propose two novel protections against AI-based blackbox deobfuscation that counter Xyntia's powerful attacks.
arXiv Detail & Related papers (2021-02-09T12:52:24Z)
Feature-level Malware Obfuscation in Deep Learning [0.0]
We train a deep neural network classifier for malware classification using features of benign and malware samples. We demonstrate a steep increase in false negative rate (i.e., attacks succeed) when features of a benign app are randomly added to malware. We find that for API calls, it is possible to reject the vast majority of attacks, whereas using Intents or Permissions is less successful.
arXiv Detail & Related papers (2020-02-10T00:47:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.