On the Robustness of Random Forest Against Untargeted Data Poisoning: An
Ensemble-Based Approach
- URL: http://arxiv.org/abs/2209.14013v3
- Date: Mon, 28 Aug 2023 07:32:23 GMT
- Title: On the Robustness of Random Forest Against Untargeted Data Poisoning: An
Ensemble-Based Approach
- Authors: Marco Anisetti, Claudio A. Ardagna, Alessandro Balestrucci, Nicola
Bena, Ernesto Damiani, Chan Yeob Yeun
- Abstract summary: In machine learning models, perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy.
This paper designs and implements a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks.
- Score: 42.81632484264218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning is becoming ubiquitous. From finance to medicine, machine
learning models are boosting decision-making processes and even outperforming
humans in some tasks. This huge progress in terms of prediction quality does
not however find a counterpart in the security of such models and corresponding
predictions, where perturbations of fractions of the training set (poisoning)
can seriously undermine the model accuracy. Research on poisoning attacks and
defenses received increasing attention in the last decade, leading to several
promising solutions aiming to increase the robustness of machine learning.
Among them, ensemble-based defenses, where different models are trained on
portions of the training set and their predictions are then aggregated, provide
strong theoretical guarantees at the price of a linear overhead. Surprisingly,
ensemble-based defenses, which do not pose any restrictions on the base model,
have not been applied to increase the robustness of random forest models. The
work in this paper aims to fill in this gap by designing and implementing a
novel hash-based ensemble approach that protects random forest against
untargeted, random poisoning attacks. An extensive experimental evaluation
measures the performance of our approach against a variety of attacks, as well
as its sustainability in terms of resource consumption and performance, and
compares it with a traditional monolithic model based on random forest. A final
discussion presents our main findings and compares our approach with existing
poisoning defenses targeting random forests.
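Below is a minimal sketch, not part of the original page and not the authors' implementation, of the hash-based ensemble idea the abstract describes: each training sample is routed by a hash into one of several disjoint subsets, one random forest is trained per subset, and predictions are aggregated by majority vote. The class name, the use of the sample index as the hash key, and the assumption of integer class labels are illustrative choices, not details taken from the paper.

```python
# Illustrative sketch only (not the authors' code): a hash-based ensemble of
# random forests. Each training sample is routed to one of `n_models` disjoint
# subsets via a hash, so a randomly poisoned sample can influence at most one
# base model; predictions are aggregated by majority vote.
import hashlib

import numpy as np
from sklearn.ensemble import RandomForestClassifier


class HashedForestEnsemble:
    def __init__(self, n_models=5, **rf_kwargs):
        self.n_models = n_models
        self.rf_kwargs = rf_kwargs
        self.models_ = []

    def _bucket(self, idx):
        # Stable hash of the sample index -> bucket in [0, n_models).
        # (Hashing on the index is an assumption made for this sketch.)
        digest = hashlib.sha256(str(idx).encode()).hexdigest()
        return int(digest, 16) % self.n_models

    def fit(self, X, y):
        # X, y are assumed to be NumPy arrays with integer class labels.
        buckets = np.array([self._bucket(i) for i in range(len(X))])
        self.models_ = []
        for b in range(self.n_models):
            mask = buckets == b
            rf = RandomForestClassifier(**self.rf_kwargs)
            rf.fit(X[mask], y[mask])
            self.models_.append(rf)
        return self

    def predict(self, X):
        # Majority vote over the base forests.
        votes = np.stack([m.predict(X) for m in self.models_], axis=0)
        return np.apply_along_axis(
            lambda col: np.bincount(col.astype(int)).argmax(), 0, votes
        )
```

Because each poisoned point lands in exactly one bucket, the majority vote stays correct as long as most base forests remain clean, which mirrors the ensemble-based guarantees the abstract refers to.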
Related papers
- MIBench: A Comprehensive Benchmark for Model Inversion Attack and Defense [43.71365087852274]
Model Inversion (MI) attacks aim at leveraging the output information of target models to reconstruct privacy-sensitive training data.
The lack of a comprehensive, aligned, and reliable benchmark has emerged as a formidable challenge.
To address this gap, we introduce MIBench, the first practical benchmark for model inversion attacks and defenses.
arXiv Detail & Related papers (2024-10-07T16:13:49Z) - A Unified Evaluation of Textual Backdoor Learning: Frameworks and
Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z) - Scalable Whitebox Attacks on Tree-based Models [2.3186641356561646]
This paper proposes a novel whitebox adversarial robustness testing approach for tree ensemble models.
By leveraging sampling and the log-derivative trick, the proposed approach can scale up to testing tasks that were previously unmanageable.
arXiv Detail & Related papers (2022-03-31T21:36:20Z) - Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Among the many defense methods, adversarial training is the strongest strategy against various adversarial attacks.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful Self-Ensemble Adversarial Training (SEAT) method that yields a robust classifier by averaging the weights of historical models.
arXiv Detail & Related papers (2022-03-18T01:12:18Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy.
arXiv Detail & Related papers (2020-11-28T00:08:45Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual
Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - Adversarial Attack and Defense of Structured Prediction Models [58.49290114755019]
In this paper, we investigate attacks and defenses for structured prediction tasks in NLP.
The structured output of structured prediction models is sensitive to small perturbations in the input.
We propose a novel and unified framework that learns to attack a structured prediction model using a sequence-to-sequence model.
arXiv Detail & Related papers (2020-10-04T15:54:03Z) - Poisoning Attacks on Algorithmic Fairness [14.213638219685656]
We introduce an optimization framework for poisoning attacks against algorithmic fairness.
We develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data.
We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios.
arXiv Detail & Related papers (2020-04-15T08:07:01Z) - Feature Partitioning for Robust Tree Ensembles and their Certification
in Adversarial Scenarios [8.300942601020266]
We focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time.
We propose a model-agnostic strategy that builds a robust ensemble by training its base models on feature-based partitions of the given dataset (a rough sketch of this partition-and-vote idea follows this list).
Our algorithm guarantees that the majority of the models in the ensemble cannot be affected by the attacker.
arXiv Detail & Related papers (2020-04-07T12:00:40Z)
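As referenced in the last entry above, here is a rough sketch, not the paper's implementation, of the feature-partitioning idea: base models are trained on disjoint feature subsets and combined by majority vote, so an attacker perturbing a few features can influence only the models that actually use them. The function names, the choice of decision trees as base models, and the random partitioning are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation): an ensemble whose base
# models are trained on disjoint feature partitions, combined by majority vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def fit_feature_partition_ensemble(X, y, n_partitions=3, seed=0):
    # Randomly split the feature indices into disjoint partitions
    # and train one base model per partition.
    rng = np.random.default_rng(seed)
    features = rng.permutation(X.shape[1])
    partitions = np.array_split(features, n_partitions)
    models = []
    for part in partitions:
        clf = DecisionTreeClassifier(random_state=seed)
        clf.fit(X[:, part], y)
        models.append((part, clf))
    return models


def predict_majority(models, X):
    # Each base model sees only its own feature partition; the final
    # prediction is the majority vote over the base models.
    votes = np.stack([clf.predict(X[:, part]) for part, clf in models], axis=0)
    return np.apply_along_axis(
        lambda col: np.bincount(col.astype(int)).argmax(), 0, votes
    )
```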
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.