Generating Adversarial Computer Programs using Optimized Obfuscations
- URL: http://arxiv.org/abs/2103.11882v1
- Date: Thu, 18 Mar 2021 10:47:15 GMT
- Title: Generating Adversarial Computer Programs using Optimized Obfuscations
- Authors: Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shiyu Chang, Quanfu
Fan, Gaoyuan Zhang, Una-May O'Reilly
- Abstract summary: We investigate principled ways to adversarially perturb a computer program to fool such learned models.
We use program obfuscations, which have conventionally been used to avoid attempts at reverse engineering programs.
We show that our best attack proposal achieves a $52\%$ improvement over a state-of-the-art attack generation approach.
- Score: 43.95037234252815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models that learn and predict properties of computer
programs are increasingly being adopted and deployed. These models have
demonstrated success in applications such as auto-completing code, summarizing
large programs, and detecting bugs and malware in programs. In this work, we
investigate principled ways to adversarially perturb a computer program to fool
such learned models, and thus determine their adversarial robustness. We use
program obfuscations, which have conventionally been used to avoid attempts at
reverse engineering programs, as adversarial perturbations. These perturbations
modify programs in ways that do not alter their functionality but can be
crafted to deceive an ML model when making a decision. We provide a general
formulation for an adversarial program that allows applying multiple
obfuscation transformations to a program in any language. We develop
first-order optimization algorithms to efficiently determine two key aspects --
which parts of the program to transform, and what transformations to use. We
show that it is important to optimize both these aspects to generate the best
adversarially perturbed program. Due to the discrete nature of this problem, we
also propose using randomized smoothing to improve the attack loss landscape to
ease optimization. We evaluate our work on Python and Java programs on the
problem of program summarization. We show that our best attack proposal
achieves a $52\%$ improvement over a state-of-the-art attack generation
approach for programs trained on a seq2seq model. We further show that our
formulation is better at training models that are robust to adversarial
attacks.
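The abstract above is prose only; as a rough, hedged illustration of the recipe it describes, the NumPy sketch below alternates first-order (projected-gradient) updates of a relaxed site-selection vector z ("which parts of the program to transform") and per-site choice distributions U ("what transformations to use"), and estimates gradients with randomized smoothing. The toy loss, dimensions, and projection choices are illustrative stand-ins, not the authors' implementation or their seq2seq victim model.
```python
# Minimal sketch of a joint site/transformation attack with randomized smoothing.
# Everything below is a toy stand-in for the paper's formulation, not its code.
import numpy as np

rng = np.random.default_rng(0)
n_sites, vocab = 8, 20                  # candidate perturbation sites; obfuscation-token vocabulary
W = rng.normal(size=(n_sites, vocab))   # toy stand-in for a trained model's sensitivities

def attack_loss(z, U):
    """Toy surrogate for the objective the attacker minimizes (negated model loss)."""
    return -np.sum(z * np.sum(W * U, axis=1))

def grad_attack_loss(z, U):
    gz = -np.sum(W * U, axis=1)
    gU = -z[:, None] * W
    return gz, gU

def smoothed_grad(z, U, sigma=0.1, samples=10):
    """Randomized smoothing: average gradients at Gaussian-perturbed points."""
    gz, gU = np.zeros_like(z), np.zeros_like(U)
    for _ in range(samples):
        dz, dU = grad_attack_loss(z + sigma * rng.normal(size=z.shape),
                                  U + sigma * rng.normal(size=U.shape))
        gz += dz / samples
        gU += dU / samples
    return gz, gU

def project_rows_to_simplex(M):
    """Project each row onto the probability simplex (relaxed one-hot token choice)."""
    out = np.empty_like(M)
    for i, v in enumerate(M):
        u = np.sort(v)[::-1]
        css = np.cumsum(u)
        rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - 1))[0][-1]
        out[i] = np.maximum(v - (css[rho] - 1) / (rho + 1.0), 0.0)
    return out

# Alternating projected gradient descent over the relaxed attack variables.
z = np.full(n_sites, 0.5)                    # relaxed "which sites to transform"
U = np.full((n_sites, vocab), 1.0 / vocab)   # relaxed "what transformation to apply"
k, lr = 3, 0.1                               # budget: transform at most k sites
for _ in range(200):
    gz, gU = smoothed_grad(z, U)
    z = np.clip(z - lr * gz, 0.0, 1.0)
    U = project_rows_to_simplex(U - lr * gU)

sites = np.argsort(z)[-k:]                   # round: top-k sites to obfuscate ...
choices = U[sites].argmax(axis=1)            # ... and the transformation chosen at each
print("transform sites", sites.tolist(), "with choices", choices.tolist())
```
After the loop, the relaxed solution is rounded: the top-k entries of z pick the sites to obfuscate and the row-wise argmax of U picks the transformation at each site, mirroring the two aspects the abstract says must be optimized jointly.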
Related papers
- Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation [23.31928097405939]
We use a language-model-infused scaffolding program to improve itself.
A variety of self-improvement strategies are proposed by the language model.
It demonstrates that a modern language model, GPT-4, is capable of writing code that can call itself to improve itself.
arXiv Detail & Related papers (2023-10-03T17:59:32Z)
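No code accompanies this entry; the sketch below is a hedged illustration of the scaffold-style improvement loop it describes. `propose_revisions` is a hypothetical stub standing in for the language-model call, and the utility function is a toy; in STOP the scaffold can also be handed its own source code, as in the last lines.
```python
# Hedged sketch of a self-improving scaffold: ask for candidate revisions of a program
# and keep the one scoring best under a utility function.
from typing import Callable, List

def propose_revisions(program: str) -> List[str]:
    # Stand-in for "ask the language model for improved variants of `program`".
    return [program, program + "\n# (hypothetical LM-proposed tweak)"]

def improve(program: str, utility: Callable[[str], float], rounds: int = 3) -> str:
    best, best_score = program, utility(program)
    for _ in range(rounds):
        for candidate in propose_revisions(best):
            score = utility(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best

if __name__ == "__main__":
    import inspect
    # Recursive self-improvement: point the scaffold at its own source code.
    seed = inspect.getsource(improve)
    print(improve(seed, utility=lambda p: -len(p)))  # toy utility: prefer shorter code
```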
- Learning to compile smartly for program size reduction [35.58682369218936]
We propose a novel approach that learns a policy to select passes for program size reduction.
Our approach uses a search mechanism that helps identify useful pass sequences and a GNN with customized attention that selects the optimal sequence to use.
arXiv Detail & Related papers (2023-01-09T16:42:35Z)
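As a hedged illustration of the search component only (the paper pairs it with a GNN-based policy over real compiler passes), the sketch below greedily extends a pass sequence using a toy `estimate_size` stand-in; the pass names and size model are invented for illustration.
```python
# Hedged sketch: greedily grow a pass sequence, keeping the pass that most reduces
# estimated program size at each step. `estimate_size` stands in for compiling and
# measuring (or for a learned size predictor).
from typing import List

PASSES = ["dce", "inline", "loop-unroll", "constant-fold", "mem2reg"]

def estimate_size(pass_seq: List[str]) -> int:
    # Toy stand-in: each pass changes size by a fixed amount, with mild order effects.
    size = 1000
    for i, p in enumerate(pass_seq):
        size -= {"dce": 40, "inline": -20, "loop-unroll": -60,
                 "constant-fold": 25, "mem2reg": 30}[p] - 2 * i
    return size

def greedy_pass_search(max_len: int = 4) -> List[str]:
    seq: List[str] = []
    best = estimate_size(seq)
    for _ in range(max_len):
        score, p = min((estimate_size(seq + [p]), p) for p in PASSES)
        if score >= best:       # stop when no pass reduces estimated size further
            break
        best, seq = score, seq + [p]
    return seq

print(greedy_pass_search())
```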
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
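A hedged sketch of ranker-based selection: score every sampled program with a learned correctness predictor and return the top-scoring one instead of the first sample. `ranker_score` is a hypothetical stand-in for the fault-aware ranker, not the paper's model.
```python
# Hedged sketch: rerank sampled programs by predicted correctness (no execution).
from typing import List

def ranker_score(program: str) -> float:
    # Stand-in for a model that predicts P(program is correct) without running it.
    return 1.0 / (1.0 + abs(len(program) - 40))   # toy heuristic, illustration only

def select_best(samples: List[str]) -> str:
    return max(samples, key=ranker_score)

samples = ["def add(a,b): return a - b",
           "def add(a, b):\n    return a + b",
           "def add(*xs): return sum(xs)"]
print(select_best(samples))
```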
- Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
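A hedged sketch of the self-sampling idea, assuming a hypothetical `sample_from_model` stub in place of the code-generation model being trained: candidates that pass all or some of the task's tests are added to the pool of training targets.
```python
# Hedged sketch: keep self-sampled programs that are fully or partially correct
# (by fraction of test cases passed) as extra training targets.
from typing import List, Tuple

def run_tests(program: str, tests: List[Tuple[int, int]]) -> float:
    """Fraction of test cases passed (1.0 = fully correct)."""
    env: dict = {}
    try:
        exec(program, env)
        f = env["f"]
        return sum(f(x) == y for x, y in tests) / len(tests)
    except Exception:
        return 0.0

def sample_from_model(spec: str, k: int) -> List[str]:
    # Stand-in for sampling k programs from the partially trained model.
    return ["def f(x): return x + 1", "def f(x): return x * 2", "def f(x): return x"]

def grow_targets(spec: str, tests, min_pass: float = 0.5) -> List[str]:
    targets = []
    for prog in sample_from_model(spec, k=3):
        if run_tests(prog, tests) >= min_pass:   # keep fully and partially correct samples
            targets.append(prog)
    return targets

print(grow_targets("double the input", tests=[(1, 2), (3, 6)]))
```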
- Natural Language to Code Translation with Execution [82.52142893010563]
We introduce execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
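A hedged sketch of execution-based minimum Bayes risk selection: run each sampled program on a few inputs and return the one whose outputs agree with the most other samples. The candidate programs and inputs are illustrative only.
```python
# Hedged sketch: select the candidate whose execution results agree most with the others.
from typing import List

def outputs(program: str, inputs: List[int]):
    env: dict = {}
    try:
        exec(program, env)
        return tuple(env["f"](x) for x in inputs)
    except Exception:
        return None

def mbr_select(candidates: List[str], inputs: List[int]) -> str:
    sigs = [outputs(c, inputs) for c in candidates]
    def agreement(i: int) -> int:
        if sigs[i] is None:
            return -1
        return sum(sigs[i] == sigs[j] for j in range(len(candidates)) if j != i)
    return candidates[max(range(len(candidates)), key=agreement)]

samples = ["def f(x): return x ** 2",
           "def f(x): return x * x",
           "def f(x): return 2 * x"]
print(mbr_select(samples, inputs=[0, 1, 3]))
```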
- Weighted Programming [0.0]
We study weighted programming, a programming paradigm for specifying mathematical models.
We argue that weighted programming as a paradigm can be used to specify mathematical models beyond probability distributions.
arXiv Detail & Related papers (2022-02-15T17:06:43Z)
- Programming with Neural Surrogates of Programs [17.259433118432757]
We study three surrogate-based design patterns, evaluating each in case studies on a large-scale CPU simulator.
With surrogate compilation, programmers develop a surrogate that mimics the behavior of a program to deploy to end-users.
With surrogate adaptation, programmers develop a surrogate of a program then retrain that surrogate on a different task.
With surrogate optimization, programmers develop a surrogate of a program, optimize input parameters of the surrogate, then plug the optimized input parameters back into the original program.
arXiv Detail & Related papers (2021-12-12T04:45:41Z)
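A hedged sketch of the surrogate optimization pattern, assuming PyTorch is available: fit a small neural surrogate to samples of a toy black-box program, optimize the input against the differentiable surrogate, then plug the result back into the original program.
```python
# Hedged sketch of surrogate optimization; `original_program` is a toy stand-in
# for an expensive, non-differentiable program.
import torch

def original_program(x: torch.Tensor) -> torch.Tensor:
    return (x - 1.3) ** 2 + 0.1 * torch.sin(5 * x)

# 1. Collect input/output samples from the original program.
xs = torch.linspace(-2, 4, 200).unsqueeze(1)
ys = original_program(xs)

# 2. Train a small MLP surrogate on the samples.
surrogate = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    torch.nn.functional.mse_loss(surrogate(xs), ys).backward()
    opt.step()

# 3. Optimize the input parameter against the (differentiable) surrogate.
for p in surrogate.parameters():
    p.requires_grad_(False)
x = torch.zeros(1, 1, requires_grad=True)
opt_x = torch.optim.Adam([x], lr=5e-2)
for _ in range(300):
    opt_x.zero_grad()
    surrogate(x).sum().backward()
    opt_x.step()

# 4. Plug the optimized input back into the original program.
print("optimized x:", x.item(), "original program value:", original_program(x).item())
```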
- Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z)
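A hedged sketch of the transform-measure-search loop on a toy expression language (not the paper's dynamic-program transformations): apply rewrite rules, score candidates with a node-count metric, and keep any rewrite that improves it.
```python
# Hedged sketch: greedy search over rewrite rules guided by a simple efficiency metric.
# Toy program representation: a nested arithmetic expression as Python tuples.
expr = ("+", ("*", "x", 0), ("+", "y", ("*", "x", 1)))

def rewrite_once(e):
    """Yield expressions obtained by applying one simplification rule somewhere."""
    if isinstance(e, tuple):
        op, a, b = e
        if op == "*" and 0 in (a, b): yield 0                     # x*0 -> 0
        if op == "*" and b == 1:      yield a                     # x*1 -> x
        if op == "+" and 0 in (a, b): yield b if a == 0 else a    # x+0 -> x
        for ra in rewrite_once(a): yield (op, ra, b)
        for rb in rewrite_once(b): yield (op, a, rb)

def cost(e) -> int:
    """Efficiency metric: count operator nodes."""
    return 1 + cost(e[1]) + cost(e[2]) if isinstance(e, tuple) else 0

# Greedy search: accept any rewrite that lowers the metric, until none does.
improved = True
while improved:
    improved = False
    for candidate in rewrite_once(expr):
        if cost(candidate) < cost(expr):
            expr, improved = candidate, True
            break
print(expr, "cost:", cost(expr))
```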
- Covert Model Poisoning Against Federated Learning: Algorithm Design and Optimization [76.51980153902774]
Federated learning (FL) is vulnerable to external attacks on FL models during parameter transmission.
In this paper, we propose effective covert model poisoning (CMP) algorithms to counter state-of-the-art defensive aggregation mechanisms.
Our experimental results demonstrate that the proposed CMP algorithms are effective and substantially outperform existing attack mechanisms.
arXiv Detail & Related papers (2021-01-28T03:28:18Z)
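A hedged sketch of the covert-poisoning threat model described in this last entry, not the paper's CMP algorithm: a malicious client aligns its update with an attacker direction but scales it to a typical benign norm, so a simple norm-clipping aggregator does not neutralize it. All values are illustrative.
```python
# Hedged sketch: a covertly scaled malicious update survives norm-clipping aggregation.
import numpy as np

rng = np.random.default_rng(1)
dim, n_benign = 10, 9
benign = rng.normal(0.0, 0.1, size=(n_benign, dim))   # benign client updates
attacker_target = np.ones(dim)                         # direction the attacker wants

# Covert crafting: align with the attacker direction, scale to the typical benign norm.
typical_norm = np.median(np.linalg.norm(benign, axis=1))
malicious = attacker_target / np.linalg.norm(attacker_target) * typical_norm

def aggregate_with_norm_clipping(updates, clip):
    clipped = [u * min(1.0, clip / np.linalg.norm(u)) for u in updates]
    return np.mean(clipped, axis=0)

updates = np.vstack([benign, malicious])
agg = aggregate_with_norm_clipping(updates, clip=typical_norm)
drift = agg @ (attacker_target / np.linalg.norm(attacker_target))
print("aggregate drift toward attacker direction:", round(float(drift), 4))
```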
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.