Federated Boosted Decision Trees with Differential Privacy
- URL: http://arxiv.org/abs/2210.02910v1
- Date: Thu, 6 Oct 2022 13:28:29 GMT
- Title: Federated Boosted Decision Trees with Differential Privacy
- Authors: Samuel Maddock, Graham Cormode, Tianhao Wang, Carsten Maple and Somesh Jha
- Abstract summary: We propose a general framework that captures and extends existing approaches for differentially private decision trees.
We show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
- Score: 24.66980518231163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is great demand for scalable, secure, and efficient privacy-preserving
machine learning models that can be trained over distributed data. While deep
learning models typically achieve the best results in a centralized non-secure
setting, different models can excel when privacy and communication constraints
are imposed. Instead, tree-based approaches such as XGBoost have attracted much
attention for their high performance and ease of use; in particular, they often
achieve state-of-the-art results on tabular data. Consequently, several recent
works have focused on translating Gradient Boosted Decision Tree (GBDT) models
like XGBoost into federated settings, via cryptographic mechanisms such as
Homomorphic Encryption (HE) and Secure Multi-Party Computation (MPC). However,
these do not always provide formal privacy guarantees, or consider the full
range of hyperparameters and implementation settings. In this work, we
implement the GBDT model under Differential Privacy (DP). We propose a general
framework that captures and extends existing approaches for differentially
private decision trees. Our framework of methods is tailored to the federated
setting, and we show that with a careful choice of techniques it is possible to
achieve very high utility while maintaining strong levels of privacy.
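To make the setting concrete, here is a minimal sketch of the two standard DP ingredients that frameworks of this kind typically combine: exponential-mechanism split selection and Laplace-noised leaf weights. The function names, the sensitivity handling, and the fixed regularizer are illustrative simplifications, not the paper's exact algorithm.

```python
import numpy as np

def exponential_mechanism(scores, eps, sensitivity, rng):
    """Sample an index with probability proportional to
    exp(eps * score / (2 * sensitivity))."""
    logits = eps * np.asarray(scores, dtype=float) / (2 * sensitivity)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(scores), p=probs)

def dp_boosting_step(grads, hess, split_gains, eps_split, eps_leaf, clip, rng):
    """One simplified DP step of GBDT training: privately pick a split,
    then release a noised XGBoost-style leaf weight.
    Gradients are assumed pre-clipped to [-clip, clip]."""
    split = exponential_mechanism(split_gains, eps_split, sensitivity=clip, rng=rng)
    leaf = -grads.sum() / (hess.sum() + 1.0)                # lambda = 1 regularizer
    noisy_leaf = leaf + rng.laplace(scale=clip / eps_leaf)  # Laplace mechanism
    return split, noisy_leaf

rng = np.random.default_rng(0)
grads = rng.uniform(-1.0, 1.0, size=100)
hess = np.ones(100)
split, leaf = dp_boosting_step(grads, hess, split_gains=[0.2, 0.5, 0.1],
                               eps_split=0.5, eps_leaf=0.5, clip=1.0, rng=rng)
```

How the budget is divided between eps_split and eps_leaf is exactly the kind of hyperparameter choice the paper's framework studies.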
Related papers
- CorBin-FL: A Differentially Private Federated Learning Mechanism using Common Randomness [6.881974834597426]
Federated learning (FL) has emerged as a promising framework for distributed machine learning.
We introduce CorBin-FL, a privacy mechanism that uses correlated binary quantization to achieve differential privacy.
We also propose AugCorBin-FL, an extension that, in addition to PLDP, provides user-level and sample-level central differential privacy guarantees (a sketch of the standard one-bit LDP quantizer follows this entry).
arXiv Detail & Related papers (2024-09-20T00:23:44Z)
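Correlated binary quantization is the paper's own construction; as background, here is a minimal sketch of the standard uncorrelated one-bit eps-LDP quantizer that such mechanisms refine. The names and clipping convention are illustrative assumptions.

```python
import numpy as np

def one_bit_ldp(x, c, eps, rng):
    """Quantize x in [-c, c] to one bit, randomize it for eps-LDP,
    and return an unbiased estimate of x."""
    x = float(np.clip(x, -c, c))
    # Stochastic quantization: b = +1 with prob (x + c) / (2c), so E[b] = x / c.
    b = 1.0 if rng.random() < (x + c) / (2 * c) else -1.0
    # Randomized response: keep the bit with prob e^eps / (e^eps + 1).
    if rng.random() >= np.exp(eps) / (np.exp(eps) + 1.0):
        b = -b
    # Debias: E[b'] = ((e^eps - 1) / (e^eps + 1)) * x / c.
    return c * b * (np.exp(eps) + 1.0) / (np.exp(eps) - 1.0)

rng = np.random.default_rng(0)
est = np.mean([one_bit_ldp(0.4, c=1.0, eps=1.0, rng=rng) for _ in range(10000)])
# est concentrates around 0.4 as the number of reports grows
```

Averaging many such unbiased one-bit reports recovers the mean; per the abstract, CorBin-FL's refinement is to correlate this quantization randomness across clients.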
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users; the user-level guarantee is recalled below.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
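For reference, a mechanism M satisfies user-level (eps, delta)-DP when, for every pair of datasets D and D' differing in all records of a single user and every measurable output set S:

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```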
- Differentially Private Fine-Tuning of Diffusion Models [22.454127503937883]
The integration of Differential Privacy with diffusion models (DMs) presents a promising yet challenging frontier.
Recent developments in this field have highlighted the potential for generating high-quality synthetic data by pre-training on public data.
We propose a strategy optimized for private diffusion models, which minimizes the number of trainable parameters to enhance the privacy-utility trade-off.
arXiv Detail & Related papers (2024-06-03T14:18:04Z)
- Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification [54.1447806347273]
Amplification by subsampling is one of the main primitives in machine learning with differential privacy.
We propose the first general framework for deriving mechanism-specific guarantees.
We analyze how subsampling affects the privacy of groups of multiple users; the classical mechanism-agnostic bound is recalled below for contrast.
arXiv Detail & Related papers (2024-03-07T19:36:05Z)
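For contrast with the paper's mechanism-specific analysis, the classical mechanism-agnostic bound is simple to state: if a mechanism satisfies eps-DP, then running it on a Poisson subsample drawn with rate q satisfies eps'-DP with

```latex
\varepsilon' \;=\; \log\bigl(1 + q\,(e^{\varepsilon} - 1)\bigr) \;\le\; q\,(e^{\varepsilon} - 1)
```

so small sampling rates shrink the effective privacy loss roughly linearly in q.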
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees; the single-step intuition is sketched below.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
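The single-step intuition, ignoring subsampling and composition (which the paper's accountant handles): one step of the Gaussian mechanism with clipping bound C and noise scale sigma has worst-case Renyi DP of order alpha equal to alpha C^2 / (2 sigma^2), while an example whose clipped gradient has norm c_i <= C incurs only

```latex
\varepsilon_i(\alpha) \;=\; \frac{\alpha\, c_i^2}{2\sigma^2} \;\le\; \frac{\alpha\, C^2}{2\sigma^2}
```

so the well-fit majority of examples, whose gradients are small, enjoy tighter individual guarantees.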
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z)
- Large-Scale Secure XGB for Vertical Federated Learning [15.864654742542246]
In this paper, we aim to build large-scale secure XGB under the vertical federated learning setting.
We employ secure multi-party computation techniques to avoid leaking intermediate information during training.
By proposing secure permutation protocols, we can improve training efficiency and make the framework scale to large datasets (a toy illustration of the underlying secret-sharing primitive follows this entry).
arXiv Detail & Related papers (2020-05-18T06:31:10Z)
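As a toy illustration of the secret-sharing primitive underlying such MPC protocols (not the paper's permutation protocol), additive shares let parties aggregate statistics such as gradient histograms without revealing individual values; the modulus and two-party setup below are illustrative choices.

```python
import secrets

P = 2**61 - 1  # public prime modulus (illustrative choice)

def share(x, n_parties):
    """Split integer x into n additive shares modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Two parties sum private values 42 and 58 without revealing either:
a_shares, b_shares = share(42, 2), share(58, 2)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]  # local additions
assert reconstruct(sum_shares) == 100
```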
- FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection [26.54574385850849]
In this work, we propose a two-stage framework FedSel for federated SGD under LDP.
Specifically, we propose three private dimension selection mechanisms and adapt the accumulation technique to stabilize the learning process with noisy updates.
We also theoretically analyze the privacy, accuracy, and time complexity of FedSel, which outperforms state-of-the-art solutions (a generic private-selection sketch follows this entry).
arXiv Detail & Related papers (2020-03-24T03:31:21Z)
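FedSel's three selection mechanisms are the paper's contribution; as a generic stand-in, one coordinate can be selected privately with the exponential mechanism over clipped gradient magnitudes. The names and sensitivity bound here are illustrative assumptions, not FedSel's actual mechanisms.

```python
import numpy as np

def private_top1_dimension(grad, eps, clip, rng):
    """Pick one coordinate with probability proportional to
    exp(eps * min(|g_j|, clip) / (2 * clip)); clip bounds the sensitivity."""
    scores = np.minimum(np.abs(grad), clip)
    logits = eps * scores / (2 * clip)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(grad), p=probs)

rng = np.random.default_rng(0)
g = np.array([0.01, -0.9, 0.05, 0.3])
dim = private_top1_dimension(g, eps=1.0, clip=1.0, rng=rng)  # index 1 is most likely
```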
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.