Generating Adversarial Computer Programs using Optimized Obfuscations
- URL: http://arxiv.org/abs/2103.11882v1
- Date: Thu, 18 Mar 2021 10:47:15 GMT
- Title: Generating Adversarial Computer Programs using Optimized Obfuscations
- Authors: Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shiyu Chang, Quanfu
Fan, Gaoyuan Zhang, Una-May O'Reilly
- Abstract summary: We investigate principled ways to adversarially perturb a computer program to fool such learned models.
We use program obfuscations, which have conventionally been used to avoid attempts at reverse engineering programs.
We show that our best attack proposal achieves a $52\%$ improvement over a state-of-the-art attack generation approach.
- Score: 43.95037234252815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models that learn and predict properties of computer
programs are increasingly being adopted and deployed. These models have
demonstrated success in applications such as auto-completing code, summarizing
large programs, and detecting bugs and malware in programs. In this work, we
investigate principled ways to adversarially perturb a computer program to fool
such learned models, and thus determine their adversarial robustness. We use
program obfuscations, which have conventionally been used to avoid attempts at
reverse engineering programs, as adversarial perturbations. These perturbations
modify programs in ways that do not alter their functionality but can be
crafted to deceive an ML model when making a decision. We provide a general
formulation for an adversarial program that allows applying multiple
obfuscation transformations to a program in any language. We develop
first-order optimization algorithms to efficiently determine two key aspects --
which parts of the program to transform, and what transformations to use. We
show that it is important to optimize both these aspects to generate the best
adversarially perturbed program. Due to the discrete nature of this problem, we
also propose using randomized smoothing to improve the attack loss landscape to
ease optimization. We evaluate our work on Python and Java programs on the
problem of program summarization. We show that our best attack proposal
achieves a $52\%$ improvement over a state-of-the-art attack generation
approach for programs trained on a seq2seq model. We further show that our
formulation is better at training models that are robust to adversarial
attacks.
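The abstract above is prose only; as a rough, hedged illustration of the recipe it describes, the NumPy sketch below alternates first-order (projected-gradient) updates of a relaxed site-selection vector z ("which parts of the program to transform") and per-site choice distributions U ("what transformations to use"), and estimates gradients with randomized smoothing. The toy loss, dimensions, and projection choices are illustrative stand-ins, not the authors' implementation or their seq2seq victim model.
```python
# Minimal sketch of a joint site/transformation attack with randomized smoothing.
# Everything below is a toy stand-in for the paper's formulation, not its code.
import numpy as np

rng = np.random.default_rng(0)
n_sites, vocab = 8, 20                  # candidate perturbation sites; obfuscation-token vocabulary
W = rng.normal(size=(n_sites, vocab))   # toy stand-in for a trained model's sensitivities

def attack_loss(z, U):
    """Toy surrogate for the objective the attacker minimizes (negated model loss)."""
    return -np.sum(z * np.sum(W * U, axis=1))

def grad_attack_loss(z, U):
    gz = -np.sum(W * U, axis=1)
    gU = -z[:, None] * W
    return gz, gU

def smoothed_grad(z, U, sigma=0.1, samples=10):
    """Randomized smoothing: average gradients at Gaussian-perturbed points."""
    gz, gU = np.zeros_like(z), np.zeros_like(U)
    for _ in range(samples):
        dz, dU = grad_attack_loss(z + sigma * rng.normal(size=z.shape),
                                  U + sigma * rng.normal(size=U.shape))
        gz += dz / samples
        gU += dU / samples
    return gz, gU

def project_rows_to_simplex(M):
    """Project each row onto the probability simplex (relaxed one-hot token choice)."""
    out = np.empty_like(M)
    for i, v in enumerate(M):
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
        out[i] = np.maximum(v - (css[rho] - 1) / (rho + 1.0), 0.0)
    return out

# Alternating projected gradient descent over the relaxed attack variables.
z = np.full(n_sites, 0.5)                    # relaxed "which sites to transform"
U = np.full((n_sites, vocab), 1.0 / vocab)   # relaxed "what transformation to apply"
k, lr = 3, 0.1                               # budget: transform at most k sites
for _ in range(200):
    gz, gU = smoothed_grad(z, U)
    z = np.clip(z - lr * gz, 0.0, 1.0)
    U = project_rows_to_simplex(U - lr * gU)

sites = np.argsort(z)[-k:]                   # round: top-k sites to obfuscate ...
choices = U[sites].argmax(axis=1)            # ... and the transformation chosen at each
print("transform sites", sites.tolist(), "with choices", choices.tolist())
```
After the loop, the relaxed solution is rounded: the top-k entries of z pick the sites to obfuscate and the row-wise argmax of U picks the transformation at each site, mirroring the two aspects the abstract says must be optimized jointly.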
Related papers
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation [23.31928097405939]
We use a language-model-infused scaffolding program to improve itself.
A variety of self-improvement strategies are proposed by the language model.
It demonstrates that a modern language model, GPT-4, is capable of writing code that can call itself to improve itself.
arXiv Detail & Related papers (2023-10-03T17:59:32Z)
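No code accompanies this entry; the sketch below is a hedged illustration of the scaffold-style improvement loop it describes. `propose_revisions` is a hypothetical stub standing in for the language-model call, and the utility function is a toy; in STOP the scaffold can also be handed its own source code, as in the last lines.
```python
# Hedged sketch of a self-improving scaffold: ask for candidate revisions of a program
# and keep the one scoring best under a utility function.
from typing import Callable, List

def propose_revisions(program: str) -> List[str]:
    # Stand-in for "ask the language model for improved variants of `program`".
    return [program, program + "\n# (hypothetical LM-proposed tweak)"]

def improve(program: str, utility: Callable[[str], float], rounds: int = 3) -> str:
    best, best_score = program, utility(program)
    for _ in range(rounds):
        for candidate in propose_revisions(best):
            score = utility(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best

if __name__ == "__main__":
    import inspect
    # Recursive self-improvement: point the scaffold at its own source code.
    seed = inspect.getsource(improve)
    print(improve(seed, utility=lambda p: -len(p)))  # toy utility: prefer shorter code
```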
- Learning to compile smartly for program size reduction [35.58682369218936]
We propose a novel approach that learns a policy to select passes for program size reduction.
Our approach uses a search mechanism that helps identify useful pass sequences and a GNN with customized attention that selects the optimal sequence to use.
arXiv Detail & Related papers (2023-01-09T16:42:35Z)
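As a hedged illustration of the search component only (the paper pairs it with a GNN-based policy over real compiler passes), the sketch below greedily extends a pass sequence using a toy `estimate_size` stand-in; the pass names and size model are invented for illustration.
```python
# Hedged sketch: greedily grow a pass sequence, keeping the pass that most reduces
# estimated program size at each step. `estimate_size` stands in for compiling and
# measuring (or for a learned size predictor).
from typing import List

PASSES = ["dce", "inline", "loop-unroll", "constant-fold", "mem2reg"]

def estimate_size(pass_seq: List[str]) -> int:
    # Toy stand-in: each pass changes size by a fixed amount, with mild order effects.
    size = 1000
    for i, p in enumerate(pass_seq):
        size -= {"dce": 40, "inline": -20, "loop-unroll": -60,
                 "constant-fold": 25, "mem2reg": 30}[p] - 2 * i
    return size

def greedy_pass_search(max_len: int = 4) -> List[str]:
    seq: List[str] = []
    best = estimate_size(seq)
    for _ in range(max_len):
        score, p = min((estimate_size(seq + [p]), p) for p in PASSES)
        if score >= best:       # stop when no pass reduces estimated size further
            break
        best, seq = score, seq + [p]
    return seq

print(greedy_pass_search())
```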
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
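A hedged sketch of ranker-based selection: score every sampled program with a learned correctness predictor and return the top-scoring one instead of the first sample. `ranker_score` is a hypothetical stand-in for the fault-aware ranker, not the paper's model.
```python
# Hedged sketch: rerank sampled programs by predicted correctness (no execution).
from typing import List

def ranker_score(program: str) -> float:
    # Stand-in for a model that predicts P(program is correct) without running it.
    return 1.0 / (1.0 + abs(len(program) - 40))   # toy heuristic, illustration only

def select_best(samples: List[str]) -> str:
    return max(samples, key=ranker_score)

samples = ["def add(a,b): return a - b",
           "def add(a, b):\n    return a + b",
           "def add(*xs): return sum(xs)"]
print(select_best(samples))
```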
- Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
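A hedged sketch of the self-sampling idea, assuming a hypothetical `sample_from_model` stub in place of the code-generation model being trained: candidates that pass all or some of the task's tests are added to the pool of training targets.
```python
# Hedged sketch: keep self-sampled programs that are fully or partially correct
# (by fraction of test cases passed) as extra training targets.
from typing import List, Tuple

def run_tests(program: str, tests: List[Tuple[int, int]]) -> float:
    """Fraction of test cases passed (1.0 = fully correct)."""
    env: dict = {}
    try:
        exec(program, env)
        f = env["f"]
        return sum(f(x) == y for x, y in tests) / len(tests)
    except Exception:
        return 0.0

def sample_from_model(spec: str, k: int) -> List[str]:
    # Stand-in for sampling k programs from the partially trained model.
    return ["def f(x): return x + 1", "def f(x): return x * 2", "def f(x): return x"]

def grow_targets(spec: str, tests, min_pass: float = 0.5) -> List[str]:
    targets = []
    for prog in sample_from_model(spec, k=3):
        if run_tests(prog, tests) >= min_pass:   # keep fully and partially correct samples
            targets.append(prog)
    return targets

print(grow_targets("double the input", tests=[(1, 2), (3, 6)]))
```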
- Natural Language to Code Translation with Execution [82.52142893010563]
We introduce execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
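A hedged sketch of execution-based minimum Bayes risk selection: run each sampled program on a few inputs and return the one whose outputs agree with the most other samples. The candidate programs and inputs are illustrative only.
```python
# Hedged sketch: select the candidate whose execution results agree most with the others.
from typing import List

def outputs(program: str, inputs: List[int]):
    env: dict = {}
    try:
        exec(program, env)
        return tuple(env["f"](x) for x in inputs)
    except Exception:
        return None

def mbr_select(candidates: List[str], inputs: List[int]) -> str:
    sigs = [outputs(c, inputs) for c in candidates]
    def agreement(i: int) -> int:
        if sigs[i] is None:
            return -1
        return sum(sigs[i] == sigs[j] for j in range(len(candidates)) if j != i)
    return candidates[max(range(len(candidates)), key=agreement)]

samples = ["def f(x): return x ** 2",
           "def f(x): return x * x",
           "def f(x): return 2 * x"]
print(mbr_select(samples, inputs=[0, 1, 3]))
```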
- Weighted Programming [0.0]
We study weighted programming, a programming paradigm for specifying mathematical models.
We argue that weighted programming as a paradigm can be used to specify mathematical models beyond probability distributions.
arXiv Detail & Related papers (2022-02-15T17:06:43Z)
- Programming with Neural Surrogates of Programs [17.259433118432757]
We study three surrogate-based design patterns, evaluating each in case studies on a large-scale CPU simulator.
With surrogate compilation, programmers develop a surrogate that mimics the behavior of a program to deploy to end-users.
With surrogate adaptation, programmers develop a surrogate of a program then retrain that surrogate on a different task.
With surrogate optimization, programmers develop a surrogate of a program, optimize input parameters of the surrogate, then plug the optimized input parameters back into the original program.
arXiv Detail & Related papers (2021-12-12T04:45:41Z)
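A hedged sketch of the surrogate optimization pattern, assuming PyTorch is available: fit a small neural surrogate to samples of a toy black-box program, optimize the input against the differentiable surrogate, then plug the result back into the original program.
```python
# Hedged sketch of surrogate optimization; `original_program` is a toy stand-in
# for an expensive, non-differentiable program.
import torch

def original_program(x: torch.Tensor) -> torch.Tensor:
    return (x - 1.3) ** 2 + 0.1 * torch.sin(5 * x)

# 1. Collect input/output samples from the original program.
xs = torch.linspace(-2, 4, 200).unsqueeze(1)
ys = original_program(xs)

# 2. Train a small MLP surrogate on the samples.
surrogate = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    torch.nn.functional.mse_loss(surrogate(xs), ys).backward()
    opt.step()

# 3. Optimize the input parameter against the (differentiable) surrogate.
for p in surrogate.parameters():
    p.requires_grad_(False)
x = torch.zeros(1, 1, requires_grad=True)
opt_x = torch.optim.Adam([x], lr=5e-2)
for _ in range(300):
    opt_x.zero_grad()
    surrogate(x).sum().backward()
    opt_x.step()

# 4. Plug the optimized input back into the original program.
print("optimized x:", x.item(), "original program value:", original_program(x).item())
```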
- Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z)
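A hedged sketch of the transform-measure-search loop on a toy expression language (not the paper's dynamic-program transformations): apply rewrite rules, score candidates with a node-count metric, and keep any rewrite that improves it.
```python
# Hedged sketch: greedy search over rewrite rules guided by a simple efficiency metric.
# Toy program representation: a nested arithmetic expression as Python tuples.
expr = ("+", ("*", "x", 0), ("+", "y", ("*", "x", 1)))

def rewrite_once(e):
    """Yield expressions obtained by applying one simplification rule somewhere."""
    if isinstance(e, tuple):
        op, a, b = e
        if op == "*" and 0 in (a, b): yield 0                     # x*0 -> 0
        if op == "*" and b == 1:      yield a                     # x*1 -> x
        if op == "+" and 0 in (a, b): yield b if a == 0 else a    # x+0 -> x
        for ra in rewrite_once(a): yield (op, ra, b)
        for rb in rewrite_once(b): yield (op, a, rb)

def cost(e) -> int:
    """Efficiency metric: count operator nodes."""
    return 1 + cost(e[1]) + cost(e[2]) if isinstance(e, tuple) else 0

# Greedy search: accept any rewrite that lowers the metric, until none does.
improved = True
while improved:
    improved = False
    for candidate in rewrite_once(expr):
        if cost(candidate) < cost(expr):
            expr, improved = candidate, True
            break
print(expr, "cost:", cost(expr))
```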
- Covert Model Poisoning Against Federated Learning: Algorithm Design and Optimization [76.51980153902774]
Federated learning (FL) is vulnerable to external attacks on FL models during parameter transmission.
In this paper, we propose effective covert model poisoning (CMP) algorithms to counter state-of-the-art defensive aggregation mechanisms.
Our experimental results demonstrate that the proposed CMP algorithms are effective and substantially outperform existing attack mechanisms.
arXiv Detail & Related papers (2021-01-28T03:28:18Z)
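A hedged sketch of the covert-poisoning threat model described in this last entry, not the paper's CMP algorithm: a malicious client aligns its update with an attacker direction but scales it to a typical benign norm, so a simple norm-clipping aggregator does not neutralize it. All values are illustrative.
```python
# Hedged sketch: a covertly scaled malicious update survives norm-clipping aggregation.
import numpy as np

rng = np.random.default_rng(1)
dim, n_benign = 10, 9
benign = rng.normal(0.0, 0.1, size=(n_benign, dim))   # benign client updates
attacker_target = np.ones(dim)                         # direction the attacker wants

# Covert crafting: align with the attacker direction, scale to the typical benign norm.
typical_norm = np.median(np.linalg.norm(benign, axis=1))
malicious = attacker_target / np.linalg.norm(attacker_target) * typical_norm

def aggregate_with_norm_clipping(updates, clip):
    clipped = [u * min(1.0, clip / np.linalg.norm(u)) for u in updates]
    return np.mean(clipped, axis=0)

updates = np.vstack([benign, malicious])
agg = aggregate_with_norm_clipping(updates, clip=typical_norm)
drift = agg @ (attacker_target / np.linalg.norm(attacker_target))
print("aggregate drift toward attacker direction:", round(float(drift), 4))
```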
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.