Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios
- URL: http://arxiv.org/abs/2004.03295v1
- Date: Tue, 7 Apr 2020 12:00:40 GMT
- Title: Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios
- Authors: Stefano Calzavara, Claudio Lucchese, Federico Marcuzzi, Salvatore Orlando
- Abstract summary: We focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time.
We propose a model-agnostic strategy that builds a robust ensemble by training its basic models on feature-based partitions of the given dataset.
Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker.
- Score: 8.300942601020266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning algorithms, however effective, are known to be vulnerable in
adversarial scenarios where a malicious user may inject manipulated instances.
In this work we focus on evasion attacks, where a model is trained in a safe
environment and exposed to attacks at test time. The attacker aims at finding a
minimal perturbation of a test instance that changes the model outcome.
We propose a model-agnostic strategy that builds a robust ensemble by
training its basic models on feature-based partitions of the given dataset. Our
algorithm guarantees that the majority of the models in the ensemble cannot be
affected by the attacker. We experimented with the proposed strategy on decision
tree ensembles, and we also propose an approximate certification method for
tree ensembles that efficiently assesses the minimal accuracy of a forest on a
given dataset, avoiding the costly computation of evasion attacks.
Experimental evaluation on publicly available datasets shows that the proposed
strategy outperforms state-of-the-art adversarial learning algorithms against
evasion attacks.
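As a concrete illustration of the feature-partitioning idea, the sketch below splits the feature set into 2b+1 disjoint groups, trains one decision tree per group, and aggregates predictions by majority vote: if an attacker can perturb at most b features, at most b of the 2b+1 trees can see a modified input, so the majority is decided by unattacked models. This is a minimal sketch under those assumptions, not the authors' algorithm; the use of scikit-learn's DecisionTreeClassifier, the random partitioning, the budget b, and the breast_cancer dataset are all illustrative choices.

```python
# Minimal sketch of a feature-partitioned majority-vote ensemble.
# Assumptions (not from the paper): an L0-style attacker with budget b features,
# scikit-learn decision trees, and a random disjoint split of the features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

b = 2                    # assumed attacker budget: at most b features perturbed
n_models = 2 * b + 1     # enough trees that b corrupted trees are never a majority
rng = np.random.default_rng(0)

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Disjoint feature groups: each perturbed feature can influence at most one tree.
partitions = np.array_split(rng.permutation(X.shape[1]), n_models)
models = [
    DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr[:, part], y_tr)
    for part in partitions
]

def predict_majority(models, partitions, X):
    """Unweighted majority vote over the per-partition trees (binary labels)."""
    votes = np.stack([m.predict(X[:, part]) for m, part in zip(models, partitions)])
    return (votes.mean(axis=0) > 0.5).astype(int)

clean_acc = (predict_majority(models, partitions, X_te) == y_te).mean()
print(f"clean test accuracy of the partitioned ensemble: {clean_acc:.3f}")
```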
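The certification claim can be illustrated in the same spirit. Continuing the sketch above (it reuses models, partitions, and numpy from the previous block), the function below computes a lower bound on accuracy under attack without ever running an evasion attack: an instance counts as certified only if a correct majority survives even when the attacker flips the votes of b trees. This pessimistic bound is a simplification made up for illustration, not the authors' approximate certification procedure.

```python
def certified_accuracy_lower_bound(models, partitions, X, y, b):
    """Pessimistic lower bound on accuracy against an attacker who modifies at
    most b features (and hence influences at most b of the 2b+1 trees).

    An instance is certified only if, even after flipping b correct votes to
    the wrong class, the remaining correct votes still form a strict majority.
    No actual evasion attack is computed."""
    votes = np.stack([m.predict(X[:, part]) for m, part in zip(models, partitions)])
    correct = (votes == y).sum(axis=0)   # correct votes per test instance
    n_models = len(models)               # 2b + 1 trees
    return float((correct - b > n_models // 2).mean())

# Example: a guaranteed floor on the ensemble's accuracy when b features are perturbed.
lb = certified_accuracy_lower_bound(models, partitions, X_te, y_te, b=2)
print(f"certified accuracy lower bound (b=2): {lb:.3f}")
```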
Related papers
- Simple Perturbations Subvert Ethereum Phishing Transactions Detection: An Empirical Analysis [12.607077453567594]
We investigate the impact of various adversarial attack strategies on model performance metrics, such as accuracy, precision, recall, and F1-score.
We examine the effectiveness of different mitigation strategies, including adversarial training and enhanced feature selection, in enhancing model robustness.
arXiv Detail & Related papers (2024-08-06T20:40:20Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Discriminative Adversarial Unlearning [40.30974185546541]
We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm.
We capitalize on the capabilities of strong Membership Inference Attacks (MIA) to facilitate the unlearning of specific samples from a trained model.
Our proposed algorithm closely approximates the ideal benchmark of retraining from scratch for both random sample forgetting and class-wise forgetting schemes.
arXiv Detail & Related papers (2024-02-10T03:04:57Z)
- Group-based Robustness: A General Framework for Customized Robustness in the Real World [16.376584375681812]
We find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model's ability to withstand attacks from one set of source classes to another set of target classes.
We propose a new metric, termed group-based robustness, that complements existing metrics and is better suited for evaluating model performance in certain attack scenarios.
We show that, with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes.
arXiv Detail & Related papers (2023-06-29T01:07:12Z)
- A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- Automated Decision-based Adversarial Attacks [48.01183253407982]
We consider the practical and challenging decision-based black-box adversarial setting.
Under this setting, the attacker can only acquire the final classification labels by querying the target model.
We propose to automatically discover decision-based adversarial attack algorithms.
arXiv Detail & Related papers (2021-05-09T13:15:10Z)
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.