AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate
- URL: http://arxiv.org/abs/2102.04805v1
- Date: Tue, 9 Feb 2021 12:52:24 GMT
- Title: AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate
- Authors: Grégoire Menguy, Sébastien Bardin, Richard Bonichon, Cauim de Souza Lima
- Abstract summary: The new field of AI-based blackbox deobfuscation is still in its infancy.
The new optimized AI-based blackbox deobfuscator Xyntia significantly outperforms prior work in terms of success rate.
We propose two novel protections against AI-based blackbox deobfuscation that counter Xyntia's powerful attacks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Code obfuscation aims at protecting Intellectual Property and other secrets
embedded into software from being retrieved. Recent works leverage advances in
artificial intelligence with the hope of getting blackbox deobfuscators
completely immune to standard (whitebox) protection mechanisms. While
promising, this new field of AI-based blackbox deobfuscation is still in its
infancy. In this article we deepen the state of AI-based blackbox deobfuscation
in three key directions: understand the current state-of-the-art, improve over
it and design dedicated protection mechanisms. In particular, we define a novel
generic framework for AI-based blackbox deobfuscation encompassing prior work
and highlighting key components; we are the first to point out that the search
space underlying code deobfuscation is too unstable for simulation-based
methods (e.g., Monte Carlo Tree Search used in prior work) and advocate the use
of robust methods such as S-metaheuristics; we propose the new optimized
AI-based blackbox deobfuscator Xyntia which significantly outperforms prior
work in terms of success rate (especially with small time budget) while being
completely immune to the most recent anti-analysis code obfuscation methods;
and finally we propose two novel protections against AI-based blackbox
deobfuscation, countering Xyntia's powerful attacks.
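As a hedged illustration of the S-metaheuristic idea advocated above (this is not Xyntia's actual algorithm; the oracle, expression grammar, and parameters below are invented for the sketch), the following Python snippet synthesizes an expression matching the input/output behavior of an obfuscated blackbox using random-restart hill climbing:

```python
import random

random.seed(0)

# Hypothetical oracle: the obfuscated code under attack, queried only as a
# blackbox. (x | y) + (x & y) is a classic mixed boolean-arithmetic (MBA)
# rewrite of x + y.
def oracle(x, y):
    return (x | y) + (x & y)

SAMPLES = [(random.randrange(256), random.randrange(256)) for _ in range(32)]

# Tiny expression grammar: leaves are 'x'/'y', internal nodes are binary ops.
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '&': lambda a, b: a & b, '|': lambda a, b: a | b,
       '^': lambda a, b: a ^ b}

def rand_expr(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(['x', 'y'])
    return (random.choice(list(OPS)), rand_expr(depth - 1), rand_expr(depth - 1))

def evaluate(e, x, y):
    if e == 'x':
        return x
    if e == 'y':
        return y
    op, l, r = e
    return OPS[op](evaluate(l, x, y), evaluate(r, x, y))

def loss(e):
    # distance between candidate and oracle outputs on the sampled inputs
    return sum(abs(evaluate(e, x, y) - oracle(x, y)) for x, y in SAMPLES)

def mutate(e):
    # replace a random subtree with a fresh one
    if isinstance(e, str) or random.random() < 0.3:
        return rand_expr(2)
    op, l, r = e
    return (op, mutate(l), r) if random.random() < 0.5 else (op, l, mutate(r))

def hill_climb(restarts=50, steps=200):
    best, best_loss = 'x', loss('x')
    for _ in range(restarts):
        cur = rand_expr(2)
        cur_loss = loss(cur)
        for _ in range(steps):
            cand = mutate(cur)
            cand_loss = loss(cand)
            if cand_loss <= cur_loss:   # accept sideways moves too
                cur, cur_loss = cand, cand_loss
        if cur_loss < best_loss:
            best, best_loss = cur, cur_loss
        if best_loss == 0:
            break
    return best, best_loss

expr, err = hill_climb()
print(expr, err)
```

A loss of 0 means the synthesized expression mimics the oracle on all sampled inputs; the attacker never inspects the obfuscated code itself, which is why whitebox protections do not interfere.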
Related papers
- An Empirical Study of Code Obfuscation Practices in the Google Play Store [4.177277588440524]
We analyze over 500,000 Android APKs from Google Play, spanning an eight-year period.
Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools.
Obfuscation is more prevalent in top-ranked apps and in gaming genres such as casino apps.
arXiv Detail & Related papers (2025-02-07T03:41:40Z)
- Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation [36.12009987721901]
Malware authors often employ code obfuscations to make their malware harder to detect.
Existing tools for generating obfuscated code often require access to the original source code.
Can Large Language Models potentially generate a new obfuscated assembly code?
If so, this poses a risk to anti-virus engines and potentially increases the flexibility of attackers to create new obfuscation patterns.
arXiv Detail & Related papers (2024-12-20T18:31:24Z)
- CodeCipher: Learning to Obfuscate Source Code Against LLMs [5.872773591957006]
We propose CodeCipher, a novel method that conceals private information in code while preserving the original response from LLMs.
CodeCipher transforms the LLM's embedding matrix so that each row corresponds to a different word in the original matrix, forming a token-to-token confusion mapping for obfuscating source code.
Results show that our model successfully conceals private information in source code while preserving the original LLM's performance.
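CodeCipher learns its mapping by optimizing the LLM's embedding matrix; as a hedged sketch, the toy snippet below (the identifier vocabulary and the random permutation are invented for illustration) only shows what a token-to-token confusion mapping does to source code:

```python
import random
import re

random.seed(1)

# Toy identifier vocabulary; a random permutation stands in for the
# learned embedding-matrix transformation described in the paper.
VOCAB = ["total", "price", "count", "items", "rate",
         "alpha", "beta", "gamma", "delta", "omega"]
shuffled = VOCAB[:]
random.shuffle(shuffled)
ENCODE = dict(zip(VOCAB, shuffled))          # token -> confused token
DECODE = {v: k for k, v in ENCODE.items()}   # inverse mapping

def apply_mapping(src, table):
    # substitute every known identifier token via the mapping table
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: table.get(m.group(0), m.group(0)), src)

code = "total = price * count"
hidden = apply_mapping(code, ENCODE)
restored = apply_mapping(hidden, DECODE)
print(hidden)
```

Because the mapping is a bijection, the owner can invert it, while a reader (or model) seeing only the confused tokens loses the identifiers' semantics.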
arXiv Detail & Related papers (2024-10-08T08:28:54Z)
- Automated Hardware Logic Obfuscation Framework Using GPT [3.1789948141373077]
We introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process.
The proposed framework accepts hardware design netlists and key sizes as inputs, and autonomously generates obfuscated code tailored to enhance security.
arXiv Detail & Related papers (2024-05-20T17:33:00Z)
- Understanding crypter-as-a-service in a popular underground marketplace [51.328567400947435]
Crypters are pieces of software whose main goal is to transform a target binary so it can avoid detection by antivirus (AV) applications.
The crypter-as-a-service model has gained popularity, in response to the increased sophistication of detection mechanisms.
This paper provides the first study on an online underground market dedicated to crypter-as-a-service.
arXiv Detail & Related papers (2024-05-20T08:35:39Z)
- Cryptic Bytes: WebAssembly Obfuscation for Evading Cryptojacking Detection [0.0]
We present the most comprehensive evaluation of code obfuscation techniques for WebAssembly to date.
We obfuscate a diverse set of applications, including utilities, games, and crypto miners, using state-of-the-art obfuscation tools like Tigress and wasm-mutate.
We make our dataset of over 20,000 obfuscated WebAssembly binaries and the emcc-obf tool publicly available to stimulate further research.
arXiv Detail & Related papers (2024-03-22T13:32:08Z)
- JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models [53.83273575102087]
We propose an unsupervised inference-time approach to authorship obfuscation.
We introduce JAMDEC, a user-controlled, inference-time algorithm for authorship obfuscation.
Our approach builds on small language models such as GPT2-XL in order to help avoid disclosing the original content to proprietary LLMs' APIs.
arXiv Detail & Related papers (2024-02-13T19:54:29Z)
- Baseline Defenses for Adversarial Attacks Against Aligned Language Models [109.75753454188705]
Recent work shows that text optimizers can produce jailbreaking prompts that bypass defenses.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
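The paper's detection defense scores inputs with an LLM's perplexity; as a hedged stand-in (the corpus, example prompts, and threshold below are invented, and a character-bigram model replaces the LLM), this sketch illustrates the filtering idea:

```python
import math
from collections import Counter

# Train a toy character-bigram model on "benign" text; adversarially
# optimized prompts tend to look unnatural and score high perplexity.
CORPUS = "please summarize this article about code obfuscation and security " * 20

PAIRS = Counter(zip(CORPUS, CORPUS[1:]))   # bigram counts
UNI = Counter(CORPUS[:-1])                 # unigram counts
V = len(set(CORPUS))                       # vocabulary size for smoothing

def perplexity(s):
    # add-one smoothed bigram perplexity: high values flag unusual text
    logp = sum(math.log((PAIRS[(a, b)] + 1) / (UNI[a] + V))
               for a, b in zip(s, s[1:]))
    return math.exp(-logp / max(len(s) - 1, 1))

def is_suspicious(prompt, threshold=15.0):
    # hypothetical cutoff; a deployed filter would calibrate it on real data
    return perplexity(prompt) > threshold

normal = "please summarize this article about security"
gibberish = "zx!qv@@kj{]&& descr;;~~ pp##"
print(perplexity(normal), perplexity(gibberish))
```

The paper's point stands even in this toy form: fluent prompts score low, while optimizer-generated suffixes full of rare token sequences score high, though paraphrase-style attacks can evade such a filter.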
arXiv Detail & Related papers (2023-09-01T17:59:44Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS)
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
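ZO-AE-DS itself combines an autoencoder with denoised smoothing; its core zeroth-order ingredient can be sketched as a finite-difference gradient estimator that needs only query access (the quadratic "blackbox" and all parameters below are invented for illustration):

```python
import random

random.seed(0)

# Zeroth-order (ZO) gradient estimation: with only input queries and
# output feedback, and no backpropagation through the blackbox, the
# gradient is estimated by finite differences along random directions.
def blackbox_loss(x):
    # stand-in for a blackbox model's loss; true gradient is 2*(x - 1)
    return sum((xi - 1.0) ** 2 for xi in x)

def zo_gradient(f, x, mu=1e-3, queries=2000):
    d = len(x)
    grad = [0.0] * d
    fx = f(x)
    for _ in range(queries):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        # directional finite difference: (f(x + mu*u) - f(x)) / mu
        delta = (f([xi + mu * ui for xi, ui in zip(x, u)]) - fx) / mu
        for i in range(d):
            grad[i] += delta * u[i] / queries
    return grad

g = zo_gradient(blackbox_loss, [0.0, 0.0])
print(g)   # the true gradient at the origin is [-2, -2]
```

Averaging delta * u over Gaussian directions is an unbiased estimate of the gradient for smooth f; such estimates let a defender train a robustification layer around a model it can only query.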
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks [81.13338949407205]
Recent works show that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in the presence of attacks.
Existing robust bandit algorithms only work for the non-contextual setting under the attack of rewards.
We provide the first robust bandit algorithm for linear contextual bandit setting under a fully adaptive and omniscient attack.
arXiv Detail & Related papers (2021-06-05T22:20:34Z)
- Adversarial Robustness by Design through Analog Computing and Synthetic Gradients [80.60080084042666]
We propose a new defense mechanism against adversarial attacks inspired by an optical co-processor.
In the white-box setting, our defense works by obfuscating the parameters of the random projection.
We find the combination of a random projection and binarization in the optical system also improves robustness against various types of black-box attacks.
arXiv Detail & Related papers (2021-01-06T16:15:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.