Federated Boosted Decision Trees with Differential Privacy
- URL: http://arxiv.org/abs/2210.02910v1
- Date: Thu, 6 Oct 2022 13:28:29 GMT
- Title: Federated Boosted Decision Trees with Differential Privacy
- Authors: Samuel Maddock, Graham Cormode, Tianhao Wang, Carsten Maple and Somesh Jha
- Abstract summary: We propose a general framework that captures and extends existing approaches for differentially private decision trees.
We show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
- Score: 24.66980518231163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is great demand for scalable, secure, and efficient privacy-preserving
machine learning models that can be trained over distributed data. While deep
learning models typically achieve the best results in a centralized non-secure
setting, different models can excel when privacy and communication constraints
are imposed. Instead, tree-based approaches such as XGBoost have attracted much
attention for their high performance and ease of use; in particular, they often
achieve state-of-the-art results on tabular data. Consequently, several recent
works have focused on translating Gradient Boosted Decision Tree (GBDT) models
like XGBoost into federated settings, via cryptographic mechanisms such as
Homomorphic Encryption (HE) and Secure Multi-Party Computation (MPC). However,
these do not always provide formal privacy guarantees, or consider the full
range of hyperparameters and implementation settings. In this work, we
implement the GBDT model under Differential Privacy (DP). We propose a general
framework that captures and extends existing approaches for differentially
private decision trees. Our framework of methods is tailored to the federated
setting, and we show that with a careful choice of techniques it is possible to
achieve very high utility while maintaining strong levels of privacy.
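To make the setting concrete, here is a minimal sketch of the two standard DP ingredients that frameworks of this kind typically combine: exponential-mechanism split selection and Laplace-noised leaf weights. The function names, the sensitivity handling, and the fixed regularizer are illustrative simplifications, not the paper's exact algorithm.

```python
import numpy as np

def exponential_mechanism(scores, eps, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(eps * score / (2 * sensitivity))."""
    logits = eps * np.asarray(scores, dtype=float) / (2 * sensitivity)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

def dp_boosting_step(grads, hess, split_gains, eps_split, eps_leaf, clip, rng):
    """One simplified DP step of GBDT training: privately pick a split,
    then release a noised XGBoost-style leaf weight.
    Gradients are assumed pre-clipped to [-clip, clip]."""
    split = exponential_mechanism(split_gains, eps_split, sensitivity=clip, rng=rng)
    leaf = -grads.sum() / (hess.sum() + 1.0)                # lambda = 1 regularizer
    noisy_leaf = leaf + rng.laplace(scale=clip / eps_leaf)  # Laplace mechanism
    return split, noisy_leaf

rng = np.random.default_rng(0)
grads = rng.uniform(-1.0, 1.0, size=100)
hess = np.ones(100)
split, leaf = dp_boosting_step(grads, hess, split_gains=[0.2, 0.5, 0.1],
                               eps_split=0.5, eps_leaf=0.5, clip=1.0, rng=rng)
```

How the budget is divided between eps_split and eps_leaf is exactly the kind of hyperparameter choice the paper's framework studies.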
Related papers
- CorBin-FL: A Differentially Private Federated Learning Mechanism using Common Randomness [6.881974834597426]
Federated learning (FL) has emerged as a promising framework for distributed machine learning.
We introduce CorBin-FL, a privacy mechanism that uses correlated binary quantization to achieve differential privacy.
We also propose AugCorBin-FL, an extension that, in addition to PLDP, provides user-level and sample-level central differential privacy guarantees (a sketch of the standard one-bit LDP quantizer follows this entry).
arXiv Detail & Related papers (2024-09-20T00:23:44Z)
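Correlated binary quantization is the paper's own construction; as background, here is a minimal sketch of the standard uncorrelated one-bit eps-LDP quantizer that such mechanisms refine. The names and clipping convention are illustrative assumptions.

```python
import numpy as np

def one_bit_ldp(x, c, eps, rng):
    """Quantize x in [-c, c] to one bit, randomize it for eps-LDP,
    and return an unbiased estimate of x."""
    x = float(np.clip(x, -c, c))
    # Stochastic quantization: b = +1 with prob (x + c) / (2c), so E[b] = x / c.
    b = 1.0 if rng.random() < (x + c) / (2 * c) else -1.0
    # Randomized response: keep the bit with prob e^eps / (e^eps + 1).
    if rng.random() >= np.exp(eps) / (np.exp(eps) + 1.0):
        b = -b
    # Debias: E[b'] = ((e^eps - 1) / (e^eps + 1)) * x / c.
    return c * b * (np.exp(eps) + 1.0) / (np.exp(eps) - 1.0)

rng = np.random.default_rng(0)
est = np.mean([one_bit_ldp(0.4, c=1.0, eps=1.0, rng=rng) for _ in range(10000)])
# est concentrates around 0.4 as the number of reports grows
```

Averaging many such unbiased one-bit reports recovers the mean; per the abstract, CorBin-FL's refinement is to correlate this quantization randomness across clients.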
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users; the user-level guarantee is recalled below.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
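For reference, a mechanism M satisfies user-level (eps, delta)-DP when, for every pair of datasets D and D' differing in all records of a single user and every measurable output set S:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```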
- Differentially Private Fine-Tuning of Diffusion Models [22.454127503937883]
The integration of Differential Privacy with diffusion models (DMs) presents a promising yet challenging frontier.
Recent developments in this field have highlighted the potential for generating high-quality synthetic data by pre-training on public data.
We propose a strategy optimized for private diffusion models, which minimizes the number of trainable parameters to enhance the privacy-utility trade-off.
arXiv Detail & Related papers (2024-06-03T14:18:04Z)
- Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification [54.1447806347273]
Amplification by subsampling is one of the main primitives in machine learning with differential privacy.
We propose the first general framework for deriving mechanism-specific guarantees.
We analyze how subsampling affects the privacy of groups of multiple users; the classical mechanism-agnostic bound is recalled below for contrast.
arXiv Detail & Related papers (2024-03-07T19:36:05Z)
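For contrast with the paper's mechanism-specific analysis, the classical mechanism-agnostic bound is simple to state: if a mechanism satisfies eps-DP, then running it on a Poisson subsample drawn with rate q satisfies eps'-DP with

```latex
\varepsilon' \;=\; \log\bigl(1 + q\,(e^{\varepsilon} - 1)\bigr) \;\le\; q\,(e^{\varepsilon} - 1)
```

so small sampling rates shrink the effective privacy loss roughly linearly in q.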
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees; the single-step intuition is sketched below.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
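The single-step intuition, ignoring subsampling and composition (which the paper's accountant handles): one step of the Gaussian mechanism with clipping bound C and noise scale sigma has worst-case Renyi DP of order alpha equal to alpha C^2 / (2 sigma^2), while an example whose clipped gradient has norm c_i <= C incurs only

```latex
\varepsilon_i(\alpha) \;=\; \frac{\alpha\, c_i^2}{2\sigma^2} \;\le\; \frac{\alpha\, C^2}{2\sigma^2}
```

so the well-fit majority of examples, whose gradients are small, enjoy tighter individual guarantees.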
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z)
- Large-Scale Secure XGB for Vertical Federated Learning [15.864654742542246]
In this paper, we aim to build large-scale secure XGB under the vertical federated learning setting.
We employ secure multi-party computation techniques to avoid leaking intermediate information during training.
By proposing secure permutation protocols, we can improve training efficiency and make the framework scale to large datasets (a toy illustration of the underlying secret-sharing primitive follows this entry).
arXiv Detail & Related papers (2020-05-18T06:31:10Z)
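As a toy illustration of the secret-sharing primitive underlying such MPC protocols (not the paper's permutation protocol), additive shares let parties aggregate statistics such as gradient histograms without revealing individual values; the modulus and two-party setup below are illustrative choices.

```python
import secrets

P = 2**61 - 1  # public prime modulus (illustrative choice)

def share(x, n_parties):
    """Split integer x into n additive shares modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Two parties sum private values 42 and 58 without revealing either:
a_shares, b_shares = share(42, 2), share(58, 2)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]  # local additions
assert reconstruct(sum_shares) == 100
```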
- FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection [26.54574385850849]
In this work, we propose a two-stage framework FedSel for federated SGD under LDP.
Specifically, we propose three private dimension selection mechanisms and adapt the accumulation technique to stabilize the learning process with noisy updates.
We also theoretically analyze the privacy, accuracy, and time complexity of FedSel, which outperforms state-of-the-art solutions (a generic private-selection sketch follows this entry).
arXiv Detail & Related papers (2020-03-24T03:31:21Z)
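FedSel's three selection mechanisms are the paper's contribution; as a generic stand-in, one coordinate can be selected privately with the exponential mechanism over clipped gradient magnitudes. The names and sensitivity bound here are illustrative assumptions, not FedSel's actual mechanisms.

```python
import numpy as np

def private_top1_dimension(grad, eps, clip, rng):
    """Pick one coordinate with probability proportional to
    exp(eps * min(|g_j|, clip) / (2 * clip)); clip bounds the sensitivity."""
    scores = np.minimum(np.abs(grad), clip)
    logits = eps * scores / (2 * clip)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(grad), p=probs)

rng = np.random.default_rng(0)
g = np.array([0.01, -0.9, 0.05, 0.3])
dim = private_top1_dimension(g, eps=1.0, clip=1.0, rng=rng)  # index 1 is most likely
```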
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.