Certifying Decision Trees Against Evasion Attacks by Program Analysis
- URL: http://arxiv.org/abs/2007.02771v1
- Date: Mon, 6 Jul 2020 14:18:10 GMT
- Title: Certifying Decision Trees Against Evasion Attacks by Program Analysis
- Authors: Stefano Calzavara and Pietro Ferrara and Claudio Lucchese
- Abstract summary: We propose a novel technique to verify the security of machine learning models against evasion attacks.
Our approach exploits the interpretability property of decision trees to transform them into imperative programs.
Our experiments show that our technique is both precise and efficient, yielding only a minimal number of false positives.
- Score: 9.290879387995401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has proved invaluable for a range of different tasks, yet it
also proved vulnerable to evasion attacks, i.e., maliciously crafted
perturbations of input data designed to force mispredictions. In this paper we
propose a novel technique to verify the security of decision tree models
against evasion attacks with respect to an expressive threat model, where the
attacker can be represented by an arbitrary imperative program. Our approach
exploits the interpretability property of decision trees to transform them into
imperative programs, which are amenable to traditional program analysis
techniques. By leveraging the abstract interpretation framework, we are able to
soundly verify the security guarantees of decision tree models trained over
publicly available datasets. Our experiments show that our technique is both
precise and efficient, yielding only a minimal number of false positives and
scaling up to cases which are intractable for a competitor approach.
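As a rough illustration of the idea in the abstract, the sketch below treats a decision tree as the nested if/else program it corresponds to and analyzes it with interval abstract interpretation. It is not the authors' implementation: the names `Leaf`, `Node`, `reachable_labels`, and `is_certified` are hypothetical, and the attacker is assumed to be a simple box perturbation (each feature may move by at most `eps`), which is far less expressive than the arbitrary imperative attacker programs the paper considers.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union

Interval = Tuple[float, float]  # closed interval [lo, hi]


@dataclass(frozen=True)
class Leaf:
    label: int


@dataclass(frozen=True)
class Node:
    feature: str
    threshold: float
    left: "Tree"   # branch taken when x[feature] <= threshold
    right: "Tree"  # branch taken when x[feature] > threshold


Tree = Union[Leaf, Node]


def reachable_labels(tree: Tree, box: Dict[str, Interval]) -> Set[int]:
    """Over-approximate the labels the tree can output for any input in `box`.

    Each split is handled like an if/else statement in an abstract
    interpreter: every branch the interval may enter is explored, with the
    interval refined by the branch condition.
    """
    if isinstance(tree, Leaf):
        return {tree.label}
    lo, hi = box[tree.feature]
    labels: Set[int] = set()
    if lo <= tree.threshold:  # some input in the box may take the left branch
        refined = {**box, tree.feature: (lo, min(hi, tree.threshold))}
        labels |= reachable_labels(tree.left, refined)
    if hi > tree.threshold:   # some input in the box may take the right branch
        refined = {**box, tree.feature: (max(lo, tree.threshold), hi)}
        labels |= reachable_labels(tree.right, refined)
    return labels


def is_certified(tree: Tree, x: Dict[str, float], eps: float, true_label: int) -> bool:
    """True if no perturbation of at most `eps` per feature (assumed attacker)
    can make the tree output anything other than `true_label`."""
    box = {name: (value - eps, value + eps) for name, value in x.items()}
    return reachable_labels(tree, box) == {true_label}


# Toy tree and instance, invented purely for illustration.
tree = Node("x0", 0.5, Leaf(0), Node("x1", 2.0, Leaf(0), Leaf(1)))
x = {"x0": 0.9, "x1": 3.0}
small_budget = is_certified(tree, x, eps=0.1, true_label=1)
large_budget = is_certified(tree, x, eps=0.5, true_label=1)
```

Because the analysis over-approximates the reachable labels, a positive answer is sound: if `is_certified` returns True, no attack within the assumed budget exists. A negative answer may in general be a false positive, which is the precision trade-off the abstract refers to.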
Related papers
- Rigorous Probabilistic Guarantees for Robust Counterfactual Explanations [80.86128012438834]
We show for the first time that computing the robustness of counterfactuals with respect to plausible model shifts is NP-complete.
We propose a novel probabilistic approach which is able to provide tight estimates of robustness with strong guarantees.
arXiv Detail & Related papers (2024-07-10T09:13:11Z)
- Attack Tree Generation via Process Mining [0.0]
This work aims to provide a method for the automatic generation of Attack Trees from attack logs.
The main original feature of our approach is the use of Process Mining algorithms to synthesize Attack Trees.
Our approach is supported by a prototype that, apart from the derivation and translation of the model, provides the user with an Attack Tree in the RisQFLan format.
arXiv Detail & Related papers (2024-02-19T10:55:49Z)
- Adversarial Attacks Against Uncertainty Quantification [10.655660123083607]
This work focuses on a different adversarial scenario in which the attacker is still interested in manipulating the uncertainty estimate.
In particular, the goal is to undermine the use of machine-learning models when their outputs are consumed by a downstream module or by a human operator.
arXiv Detail & Related papers (2023-09-19T12:54:09Z)
- A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)
- Logically Consistent Adversarial Attacks for Soft Theorem Provers [110.17147570572939]
We propose a generative adversarial framework for probing and improving language models' reasoning capabilities.
Our framework successfully generates adversarial attacks and identifies global weaknesses.
In addition to effective probing, we show that training on the generated samples improves the target model's performance.
arXiv Detail & Related papers (2022-04-29T19:10:12Z)
- CC-Cert: A Probabilistic Approach to Certify General Robustness of Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
arXiv Detail & Related papers (2021-09-22T12:46:04Z)
- Adversarial Attacks for Tabular Data: Application to Fraud Detection and Imbalanced Data [3.2458203725405976]
Adversarial attacks aim at producing adversarial examples, in other words, slightly modified inputs that induce the AI system to return incorrect outputs.
In this paper we illustrate a novel approach to modify and adapt state-of-the-art algorithms to imbalanced data, in the context of fraud detection.
Experimental results show that the proposed modifications lead to a perfect attack success rate.
When applied to a real-world production system, the proposed techniques are shown to pose a serious threat to the robustness of advanced AI-based fraud detection procedures.
arXiv Detail & Related papers (2021-01-20T08:58:29Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
- Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z)
- Luring of transferable adversarial perturbations in the black-box paradigm [0.0]
We present a new approach to improve the robustness of a model against black-box transfer attacks.
A removable additional neural network is included in the target model, and is designed to induce the luring effect.
Our deception-based method only needs to have access to the predictions of the target model and does not require a labeled data set.
arXiv Detail & Related papers (2020-04-10T06:48:36Z)
- Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios [8.300942601020266]
We focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time.
We propose a model-agnostic strategy that builds a robust ensemble by training its basic models on feature-based partitions of the given dataset.
Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker.
arXiv Detail & Related papers (2020-04-07T12:00:40Z)