Timber! Poisoning Decision Trees
- URL: http://arxiv.org/abs/2410.00862v1
- Date: Tue, 1 Oct 2024 16:58:54 GMT
- Title: Timber! Poisoning Decision Trees
- Authors: Stefano Calzavara, Lorenzo Cazzaro, Massimo Vettori,
- Abstract summary: We present Timber, the first white-box poisoning attack targeting decision trees.
We show that our attacks outperform existing baselines in terms of effectiveness, efficiency or both.
We also show that two representative defenses can mitigate the effect of our attacks, but fail at effectively thwarting them.
- Score: 3.2034269519186656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Timber, the first white-box poisoning attack targeting decision trees. Timber is based on a greedy attack strategy leveraging sub-tree retraining to efficiently estimate the damage performed by poisoning a given training instance. The attack relies on a tree annotation procedure which enables sorting training instances so that they are processed in increasing order of computational cost of sub-tree retraining. This sorting yields a variant of Timber supporting an early stopping criterion designed to make poisoning attacks more efficient and feasible on larger datasets. We also discuss an extension of Timber to traditional random forest models, which is useful because decision trees are normally combined into ensembles to improve their predictive power. Our experimental evaluation on public datasets shows that our attacks outperform existing baselines in terms of effectiveness, efficiency or both. Moreover, we show that two representative defenses can mitigate the effect of our attacks, but fail at effectively thwarting them.
Related papers
- Can a Single Tree Outperform an Entire Forest? [5.448070998907116]
The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy.
This study challenges such a mindset by significantly improving the testing accuracy of an oblique regression tree.
Our approach reformulates tree training as a differentiable unconstrained optimization task.
arXiv Detail & Related papers (2024-11-26T00:18:18Z) - Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Automated generation of attack trees with optimal shape and labelling [0.9833293669382975]
This article addresses the problem of automatically generating attack trees that soundly and clearly describe the ways the system can be attacked.
We introduce an attack-tree generation algorithm that minimises the tree size and the information length of its labels without sacrificing correctness.
arXiv Detail & Related papers (2023-11-22T11:52:51Z) - Towards Reasonable Budget Allocation in Untargeted Graph Structure
Attacks via Gradient Debias [50.628150015907565]
Cross-entropy loss function is used to evaluate perturbation schemes in classification tasks.
Previous methods use negative cross-entropy loss as the attack objective in attacking node-level classification models.
This paper argues about the previous unreasonable attack objective from the perspective of budget allocation.
arXiv Detail & Related papers (2023-03-29T13:02:02Z) - On the Robustness of Random Forest Against Untargeted Data Poisoning: An
Ensemble-Based Approach [42.81632484264218]
In machine learning models, perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy.
This paper aims to implement a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks.
arXiv Detail & Related papers (2022-09-28T11:41:38Z) - Classification of Bark Beetle-Induced Forest Tree Mortality using Deep
Learning [7.032774322952993]
In this work, a deep learning based method is proposed to effectively classify different stages of bark beetle attacks at the individual tree level.
The proposed method uses RetinaNet architecture to train a shallow subnetwork for classifying the different attack stages of images captured by unmanned aerial vehicles (UAVs)
Experimental evaluations demonstrate the effectiveness of the proposed method by achieving an average accuracy of 98.95%, considerably outperforming the baseline method by approximately 10%.
arXiv Detail & Related papers (2022-07-15T00:16:25Z) - Poisoning Attack against Estimating from Pairwise Comparisons [140.9033911097995]
Attackers have strong motivation and incentives to manipulate the ranking list.
Data poisoning attacks on pairwise ranking algorithms can be formalized as the dynamic and static games between the ranker and the attacker.
We propose two efficient poisoning attack algorithms and establish the associated theoretical guarantees.
arXiv Detail & Related papers (2021-07-05T08:16:01Z) - Guided Adversarial Attack for Evaluating and Enhancing Adversarial
Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss, that finds more suitable gradient-directions, increases attack efficacy and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z) - An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
adversarial attacks on tree based ensembles such as gradient boosting decision trees (DTs) and random forests (RFs)
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://chong-z/tree-ensemble-attack.
arXiv Detail & Related papers (2020-10-22T10:59:49Z) - Rectified Decision Trees: Exploring the Landscape of Interpretable and
Effective Machine Learning [66.01622034708319]
We propose a knowledge distillation based decision trees extension, dubbed rectified decision trees (ReDT)
We extend the splitting criteria and the ending condition of the standard decision trees, which allows training with soft labels.
We then train the ReDT based on the soft label distilled from a well-trained teacher model through a novel jackknife-based method.
arXiv Detail & Related papers (2020-08-21T10:45:25Z) - Optimal survival trees ensemble [0.0]
Recent studies have adopted an approach of selecting accurate and diverse trees based on individual or collective performance within an ensemble for classification and regression problems.
This work follows in the wake of these investigations and considers the possibility of growing a forest of optimal survival trees.
In addition to improve predictive performance, the proposed method reduces the number of survival trees in the ensemble as compared to the other tree based methods.
arXiv Detail & Related papers (2020-05-18T19:28:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.