AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate
- URL: http://arxiv.org/abs/2102.04805v1
- Date: Tue, 9 Feb 2021 12:52:24 GMT
- Title: AI-based Blackbox Code Deobfuscation: Understand, Improve and Mitigate
- Authors: Grégoire Menguy, Sébastien Bardin, Richard Bonichon, Cauim de Souza Lima
- Abstract summary: The new field of AI-based blackbox deobfuscation is still in its infancy.
The new optimized AI-based blackbox deobfuscator Xyntia significantly outperforms prior work in terms of success rate.
We propose two novel protections against AI-based blackbox deobfuscation that counter Xyntia's powerful attacks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Code obfuscation aims at protecting Intellectual Property and other secrets
embedded into software from being retrieved. Recent works leverage advances in
artificial intelligence with the hope of getting blackbox deobfuscators
completely immune to standard (whitebox) protection mechanisms. While
promising, this new field of AI-based blackbox deobfuscation is still in its
infancy. In this article we deepen the state of AI-based blackbox deobfuscation
in three key directions: understand the current state-of-the-art, improve over
it and design dedicated protection mechanisms. In particular, we define a novel
generic framework for AI-based blackbox deobfuscation encompassing prior work
and highlighting key components; we are the first to point out that the search
space underlying code deobfuscation is too unstable for simulation-based
methods (e.g., Monte Carlo Tree Search used in prior work) and advocate the use
of robust methods such as S-metaheuristics; we propose the new optimized
AI-based blackbox deobfuscator Xyntia which significantly outperforms prior
work in terms of success rate (especially with small time budget) while being
completely immune to the most recent anti-analysis code obfuscation methods;
and finally we propose two novel protections against AI-based blackbox
deobfuscation, countering Xyntia's powerful attacks.
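As a hedged illustration of the S-metaheuristic idea advocated above (this is not Xyntia's actual algorithm; the oracle, expression grammar, and parameters below are invented for the sketch), the following Python snippet synthesizes an expression matching the input/output behavior of an obfuscated blackbox using random-restart hill climbing:

```python
import random

random.seed(0)

# Hypothetical oracle: the obfuscated code under attack, queried only as a
# blackbox. (x | y) + (x & y) is a classic mixed boolean-arithmetic (MBA)
# rewrite of x + y.
def oracle(x, y):
    return (x | y) + (x & y)

SAMPLES = [(random.randrange(256), random.randrange(256)) for _ in range(32)]

# Tiny expression grammar: leaves are 'x'/'y', internal nodes are binary ops.
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '&': lambda a, b: a & b, '|': lambda a, b: a | b,
       '^': lambda a, b: a ^ b}

def rand_expr(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice(['x', 'y'])
    return (random.choice(list(OPS)), rand_expr(depth - 1), rand_expr(depth - 1))

def evaluate(e, x, y):
    if e == 'x':
        return x
    if e == 'y':
        return y
    op, l, r = e
    return OPS[op](evaluate(l, x, y), evaluate(r, x, y))

def loss(e):
    # distance between candidate and oracle outputs on the sampled inputs
    return sum(abs(evaluate(e, x, y) - oracle(x, y)) for x, y in SAMPLES)

def mutate(e):
    # replace a random subtree with a fresh one
    if isinstance(e, str) or random.random() < 0.3:
        return rand_expr(2)
    op, l, r = e
    return (op, mutate(l), r) if random.random() < 0.5 else (op, l, mutate(r))

def hill_climb(restarts=50, steps=200):
    best, best_loss = 'x', loss('x')
    for _ in range(restarts):
        cur = rand_expr(2)
        cur_loss = loss(cur)
        for _ in range(steps):
            cand = mutate(cur)
            cand_loss = loss(cand)
            if cand_loss <= cur_loss:   # accept sideways moves too
                cur, cur_loss = cand, cand_loss
        if cur_loss < best_loss:
            best, best_loss = cur, cur_loss
        if best_loss == 0:
            break
    return best, best_loss

expr, err = hill_climb()
print(expr, err)
```

A loss of 0 means the synthesized expression mimics the oracle on all sampled inputs; the attacker never inspects the obfuscated code itself, which is why whitebox protections do not interfere.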
Related papers
- An Empirical Study of Code Obfuscation Practices in the Google Play Store [4.177277588440524]
We analyze over 500,000 Android APKs from Google Play, spanning an eight-year period.
Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools.
Obfuscation is more prevalent in top-ranked apps and in gaming genres such as casino apps.
arXiv Detail & Related papers (2025-02-07T03:41:40Z)
- Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation [36.12009987721901]
Malware authors often employ code obfuscations to make their malware harder to detect.
Existing tools for generating obfuscated code often require access to the original source code.
Can Large Language Models potentially generate a new obfuscated assembly code?
If so, this poses a risk to anti-virus engines and potentially increases the flexibility of attackers to create new obfuscation patterns.
arXiv Detail & Related papers (2024-12-20T18:31:24Z)
- CodeCipher: Learning to Obfuscate Source Code Against LLMs [5.872773591957006]
We propose CodeCipher, a novel method that conceals private information in code while preserving the original response from LLMs.
CodeCipher transforms the LLM's embedding matrix so that each row corresponds to a different word in the original matrix, forming a token-to-token confusion mapping for obfuscating source code.
Results show that our model successfully conceals private information in source code while preserving the original LLM's performance.
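CodeCipher learns its mapping by optimizing the LLM's embedding matrix; as a hedged sketch, the toy snippet below (the identifier vocabulary and the random permutation are invented for illustration) only shows what a token-to-token confusion mapping does to source code:

```python
import random
import re

random.seed(1)

# Toy identifier vocabulary; a random permutation stands in for the
# learned embedding-matrix transformation described in the paper.
VOCAB = ["total", "price", "count", "items", "rate",
         "alpha", "beta", "gamma", "delta", "omega"]
shuffled = VOCAB[:]
random.shuffle(shuffled)
ENCODE = dict(zip(VOCAB, shuffled))          # token -> confused token
DECODE = {v: k for k, v in ENCODE.items()}   # inverse mapping

def apply_mapping(src, table):
    # substitute every known identifier token via the mapping table
    return re.sub(r"[A-Za-z_]\w*",
                  lambda m: table.get(m.group(0), m.group(0)), src)

code = "total = price * count"
hidden = apply_mapping(code, ENCODE)
restored = apply_mapping(hidden, DECODE)
print(hidden)
```

Because the mapping is a bijection, the owner can invert it, while a reader (or model) seeing only the confused tokens loses the identifiers' semantics.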
arXiv Detail & Related papers (2024-10-08T08:28:54Z)
- Automated Hardware Logic Obfuscation Framework Using GPT [3.1789948141373077]
We introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process.
The proposed framework accepts hardware design netlists and key sizes as inputs, and autonomously generates obfuscated code tailored to enhance security.
arXiv Detail & Related papers (2024-05-20T17:33:00Z)
- Understanding crypter-as-a-service in a popular underground marketplace [51.328567400947435]
Crypters are pieces of software whose main goal is to transform a target binary so it can avoid detection by antivirus (AV) applications.
The crypter-as-a-service model has gained popularity, in response to the increased sophistication of detection mechanisms.
This paper provides the first study on an online underground market dedicated to crypter-as-a-service.
arXiv Detail & Related papers (2024-05-20T08:35:39Z)
- Cryptic Bytes: WebAssembly Obfuscation for Evading Cryptojacking Detection [0.0]
We present the most comprehensive evaluation of code obfuscation techniques for WebAssembly to date.
We obfuscate a diverse set of applications, including utilities, games, and crypto miners, using state-of-the-art obfuscation tools like Tigress and wasm-mutate.
We make our dataset of over 20,000 obfuscated WebAssembly binaries and the emcc-obf tool publicly available to stimulate further research.
arXiv Detail & Related papers (2024-03-22T13:32:08Z)
- JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models [53.83273575102087]
We propose an unsupervised inference-time approach to authorship obfuscation.
We introduce JAMDEC, a user-controlled, inference-time algorithm for authorship obfuscation.
Our approach builds on small language models such as GPT2-XL in order to help avoid disclosing the original content to proprietary LLMs' APIs.
arXiv Detail & Related papers (2024-02-13T19:54:29Z)
- Baseline Defenses for Adversarial Attacks Against Aligned Language Models [109.75753454188705]
Recent work shows that text optimizers can produce jailbreaking prompts that bypass defenses.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
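The paper's detection defense scores inputs with an LLM's perplexity; as a hedged stand-in (the corpus, example prompts, and threshold below are invented, and a character-bigram model replaces the LLM), this sketch illustrates the filtering idea:

```python
import math
from collections import Counter

# Train a toy character-bigram model on "benign" text; adversarially
# optimized prompts tend to look unnatural and score high perplexity.
CORPUS = "please summarize this article about code obfuscation and security " * 20

PAIRS = Counter(zip(CORPUS, CORPUS[1:]))   # bigram counts
UNI = Counter(CORPUS[:-1])                 # unigram counts
V = len(set(CORPUS))                       # vocabulary size for smoothing

def perplexity(s):
    # add-one smoothed bigram perplexity: high values flag unusual text
    logp = sum(math.log((PAIRS[(a, b)] + 1) / (UNI[a] + V))
               for a, b in zip(s, s[1:]))
    return math.exp(-logp / max(len(s) - 1, 1))

def is_suspicious(prompt, threshold=15.0):
    # hypothetical cutoff; a deployed filter would calibrate it on real data
    return perplexity(prompt) > threshold

normal = "please summarize this article about security"
gibberish = "zx!qv@@kj{]&& descr;;~~ pp##"
print(perplexity(normal), perplexity(gibberish))
```

The paper's point stands even in this toy form: fluent prompts score low, while optimizer-generated suffixes full of rare token sequences score high, though paraphrase-style attacks can evade such a filter.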
arXiv Detail & Related papers (2023-09-01T17:59:44Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS)
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
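ZO-AE-DS itself combines an autoencoder with denoised smoothing; its core zeroth-order ingredient can be sketched as a finite-difference gradient estimator that needs only query access (the quadratic "blackbox" and all parameters below are invented for illustration):

```python
import random

random.seed(0)

# Zeroth-order (ZO) gradient estimation: with only input queries and
# output feedback, and no backpropagation through the blackbox, the
# gradient is estimated by finite differences along random directions.
def blackbox_loss(x):
    # stand-in for a blackbox model's loss; true gradient is 2*(x - 1)
    return sum((xi - 1.0) ** 2 for xi in x)

def zo_gradient(f, x, mu=1e-3, queries=2000):
    d = len(x)
    grad = [0.0] * d
    fx = f(x)
    for _ in range(queries):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        # directional finite difference: (f(x + mu*u) - f(x)) / mu
        delta = (f([xi + mu * ui for xi, ui in zip(x, u)]) - fx) / mu
        for i in range(d):
            grad[i] += delta * u[i] / queries
    return grad

g = zo_gradient(blackbox_loss, [0.0, 0.0])
print(g)   # the true gradient at the origin is [-2, -2]
```

Averaging delta * u over Gaussian directions is an unbiased estimate of the gradient for smooth f; such estimates let a defender train a robustification layer around a model it can only query.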
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks [81.13338949407205]
Recent works show that optimal bandit algorithms are vulnerable to adversarial attacks and can fail completely in the presence of attacks.
Existing robust bandit algorithms only work for the non-contextual setting under the attack of rewards.
We provide the first robust bandit algorithm for linear contextual bandit setting under a fully adaptive and omniscient attack.
arXiv Detail & Related papers (2021-06-05T22:20:34Z)
- Adversarial Robustness by Design through Analog Computing and Synthetic Gradients [80.60080084042666]
We propose a new defense mechanism against adversarial attacks inspired by an optical co-processor.
In the white-box setting, our defense works by obfuscating the parameters of the random projection.
We find the combination of a random projection and binarization in the optical system also improves robustness against various types of black-box attacks.
arXiv Detail & Related papers (2021-01-06T16:15:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.