Differentially-Private Decision Trees and Provable Robustness to Data
Poisoning
- URL: http://arxiv.org/abs/2305.15394v2
- Date: Thu, 12 Oct 2023 12:11:23 GMT
- Title: Differentially-Private Decision Trees and Provable Robustness to Data
Poisoning
- Authors: Daniël Vos, Jelle Vos, Tianyu Li, Zekeriya Erkin, Sicco Verwer
- Abstract summary: Decision trees are interpretable models that are well-suited to non-linear learning problems.
Current state-of-the-art algorithms for differentially-private decision tree learning sacrifice much utility for a small privacy benefit.
We propose PrivaTree, a method based on private histograms that chooses good splits while consuming a small privacy budget.
- Score: 8.649768969060647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision trees are interpretable models that are well-suited to non-linear
learning problems. Much work has been done on extending decision tree learning
algorithms with differential privacy, a framework that guarantees the privacy of
samples within the training data. However, current state-of-the-art algorithms
for this purpose sacrifice much utility for a small privacy benefit. These
solutions create random decision nodes that reduce decision tree accuracy or
spend an excessive share of the privacy budget on labeling leaves. Moreover,
many works do not support continuous features or leak information about them.
We propose a new method called PrivaTree based on private histograms that
chooses good splits while consuming a small privacy budget. The resulting trees
provide a significantly better privacy-utility trade-off and accept mixed
numerical and categorical data without leaking information about numerical
features. Finally, while it is notoriously hard to give robustness guarantees
against data poisoning attacks, we demonstrate bounds for the expected accuracy
and success rates of backdoor attacks against differentially-private learners.
By leveraging the better privacy-utility trade-off of PrivaTree we are able to
train decision trees with significantly better robustness against backdoor
attacks compared to regular decision trees and with meaningful theoretical
guarantees.
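The core idea of PrivaTree, choosing splits from noisy histograms, can be illustrated with a minimal sketch. This is an illustrative reconstruction, not the paper's algorithm: the function names, the Gini-style scoring, and the single-feature interface are assumptions, and PrivaTree's actual budget allocation across nodes and leaves differs.

```python
import numpy as np

def noisy_class_histograms(x, y, edges, epsilon, rng):
    """Per-class bin counts with Laplace noise (each count has sensitivity 1)."""
    hists = []
    for label in (0, 1):
        counts, _ = np.histogram(x[y == label], bins=edges)
        noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
        hists.append(np.maximum(noisy, 0.0))  # counts cannot be negative
    return hists

def gini(n0, n1):
    """Weighted Gini impurity of a (noisy) two-class count pair."""
    total = n0 + n1
    if total <= 0:
        return 0.0
    p = n1 / total
    return 2.0 * p * (1.0 - p) * total

def private_split(x, y, epsilon, n_bins=10, domain=(0.0, 1.0), rng=None):
    """Pick a split threshold for one numerical feature from noisy histograms.

    Bin edges come from a data-independent domain, so beyond the noisy
    counts nothing about the raw feature values is revealed.
    """
    rng = rng or np.random.default_rng()
    edges = np.linspace(domain[0], domain[1], n_bins + 1)
    # The two class histograms partition the rows, so by parallel
    # composition a single epsilon covers both releases.
    neg, pos = noisy_class_histograms(x, y, edges, epsilon, rng)
    best_score, best_threshold = np.inf, edges[1]
    for i in range(1, n_bins):  # candidate thresholds are interior bin edges
        score = (gini(neg[:i].sum(), pos[:i].sum())
                 + gini(neg[i:].sum(), pos[i:].sum()))
        if score < best_score:
            best_score, best_threshold = score, edges[i]
    return best_threshold
```

Because only the noisy counts enter the scoring, the chosen threshold is post-processing of a differentially-private release and inherits its guarantee.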
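The poisoning-robustness claim rests on the group-privacy property of differential privacy; the sketch below states only the generic bound (the paper derives its bounds on expected accuracy and backdoor success rates in more detail). Here A is an ε-DP learner, D and D' are datasets differing in at most k poisoned rows, and S is any outcome set (e.g., models on which a backdoor trigger fires):

```latex
\[
  \Pr[\mathcal{A}(D') \in S] \;\le\; e^{k\varepsilon}\,\Pr[\mathcal{A}(D) \in S]
\]
% Consequently, a backdoor that fires with probability p_clean when
% trained on un-poisoned data fires with probability at most
\[
  p_{\text{poisoned}} \;\le\; \min\bigl(1,\; e^{k\varepsilon}\, p_{\text{clean}}\bigr)
\]
```

The smaller the budget ε a tree learner needs for good accuracy, the more poisoned samples k it tolerates before this bound becomes vacuous, which is why PrivaTree's improved privacy-utility trade-off translates into stronger backdoor robustness.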
Related papers
- Blink: Link Local Differential Privacy in Graph Neural Networks via
Bayesian Estimation [79.64626707978418]
We propose using link local differential privacy over decentralized nodes to train graph neural networks.
Our approach spends the privacy budget separately on links and degrees of the graph for the server to better denoise the graph topology.
Our approach outperforms existing methods in terms of accuracy under varying privacy budgets.
arXiv Detail & Related papers (2023-09-06T17:53:31Z) - Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, its limitations become apparent when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Private Boosted Decision Trees via Smooth Re-Weighting [2.099922236065961]
Differential Privacy is the appropriate mathematical framework for formal guarantees of privacy.
We propose and test a practical algorithm for boosting decision trees that guarantees differential privacy.
arXiv Detail & Related papers (2022-01-29T20:08:52Z) - Fed-EINI: An Efficient and Interpretable Inference Framework for
Decision Tree Ensembles in Federated Learning [11.843365055516566]
Fed-EINI is an efficient and interpretable inference framework for federated decision tree models.
We propose to protect the decision path by the efficient additively homomorphic encryption method.
Experiments show that the inference efficiency is improved by over 50% on average.
arXiv Detail & Related papers (2021-05-20T06:40:05Z) - Scalable and Provably Accurate Algorithms for Differentially Private
Distributed Decision Tree Learning [34.79337646727395]
This paper introduces the first provably accurate algorithms for differentially private, top-down decision tree learning in the distributed setting.
We propose DP-TopDown, a general privacy preserving decision tree learning algorithm, and present two distributed implementations.
arXiv Detail & Related papers (2020-12-19T06:09:36Z) - Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z) - Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of the underlying private data.
Existing schemes add perturbations chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z) - Privacy Preserving Vertical Federated Learning for Tree-based Models [30.808567035503994]
Federated learning enables multiple organizations to jointly train a model without revealing their private data to each other.
We propose Pivot, a novel solution for privacy preserving vertical decision tree training and prediction.
arXiv Detail & Related papers (2020-08-14T02:32:36Z) - Balance is key: Private median splits yield high-utility random trees [4.90655427233754]
We propose DiPriMe forests, a novel tree-based ensemble method for differentially private regression and classification.
We show theoretically and empirically that the resulting algorithm exhibits high utility, while ensuring differential privacy.
arXiv Detail & Related papers (2020-06-15T21:59:30Z)