Related papers: Large-Scale Secure XGB for Vertical Federated Learning

Large-Scale Secure XGB for Vertical Federated Learning

URL: http://arxiv.org/abs/2005.08479v2
Date: Thu, 2 Sep 2021 04:11:18 GMT
Title: Large-Scale Secure XGB for Vertical Federated Learning
Authors: Wenjing Fang, Derun Zhao, Jin Tan, Chaochao Chen, Chaofan Yu, Li Wang, Lei Wang, Jun Zhou, Benyu Zhang
Abstract summary: In this paper, we aim to build large-scale secure XGB under vertically federated learning setting. We employ secure multi-party computation techniques to avoid leaking intermediate information during training. By proposing secure permutation protocols, we can improve the training efficiency and make the framework scale to large dataset.
Score: 15.864654742542246
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Privacy-preserving machine learning has drawn increasingly attention recently, especially with kinds of privacy regulations come into force. Under such situation, Federated Learning (FL) appears to facilitate privacy-preserving joint modeling among multiple parties. Although many federated algorithms have been extensively studied, there is still a lack of secure and practical gradient tree boosting models (e.g., XGB) in literature. In this paper, we aim to build large-scale secure XGB under vertically federated learning setting. We guarantee data privacy from three aspects. Specifically, (i) we employ secure multi-party computation techniques to avoid leaking intermediate information during training, (ii) we store the output model in a distributed manner in order to minimize information release, and (iii) we provide a novel algorithm for secure XGB predict with the distributed model. Furthermore, by proposing secure permutation protocols, we can improve the training efficiency and make the framework scale to large dataset. We conduct extensive experiments on both public datasets and real-world datasets, and the results demonstrate that our proposed XGB models provide not only competitive accuracy but also practical performance.

Related papers

Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner. Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data. The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates. We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
FewFedPIT: Towards Privacy-preserving and Few-shot Federated Instruction Tuning [54.26614091429253]
Federated instruction tuning (FedIT) is a promising solution, by consolidating collaborative training across multiple data owners. FedIT encounters limitations such as scarcity of instructional data and risk of exposure to training data extraction attacks. We propose FewFedPIT, designed to simultaneously enhance privacy protection and model performance of federated few-shot learning.
arXiv Detail & Related papers (2024-03-10T08:41:22Z)
Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings. To address these concerns, privacy-preserving graph embedding methods have emerged. We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models. Public data has been used to improve privacy-utility trade-offs for both large and small language models. We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to private data distribution.
arXiv Detail & Related papers (2023-05-20T07:55:58Z)
Federated Boosted Decision Trees with Differential Privacy [24.66980518231163]
We propose a general framework that captures and extends existing approaches for differentially private decision trees. We show that with a careful choice of techniques it is possible to achieve very high utility while maintaining strong levels of privacy.
arXiv Detail & Related papers (2022-10-06T13:28:29Z)
sqSGD: Locally Private and Communication Efficient Federated Learning [14.60645909629309]
Federated learning (FL) is a technique that trains machine learning models from decentralized data sources. We develop a gradient-based learning algorithm called sqSGD that addresses communication efficiency and high-dimensional compatibility. Experiment results show sqSGD successfully learns large models like LeNet and ResNet with local privacy constraints.
arXiv Detail & Related papers (2022-06-21T17:45:35Z)
Efficient Logistic Regression with Local Differential Privacy [0.0]
Internet of Things devices are expanding rapidly and generating huge amount of data. There is an increasing need to explore data collected from these devices. Collaborative learning provides a strategic solution for the Internet of Things settings but also raises public concern over data privacy.
arXiv Detail & Related papers (2022-02-05T22:44:03Z)
An Efficient Learning Framework For Federated XGBoost Using Secret Sharing And Distributed Optimization [47.70500612425959]
XGBoost is one of the most widely used machine learning models in the industry due to its superior learning accuracy and efficiency. It is crucial to deploy a secure and efficient federated XGBoost (FedXGB) model to tackle data isolation issues in the big data problems. In this paper, a multi-party federated XGB learning framework is proposed with a security guarantee, which reshapes the XGBoost's split criterion calculation process under a secret sharing setting. Remarkably, a thorough analysis of model security is provided as well, and multiple numerical results showcase the superiority of the proposed FedXGB
arXiv Detail & Related papers (2021-05-12T15:04:18Z)
GRAFFL: Gradient-free Federated Learning of a Bayesian Generative Model [8.87104231451079]
This paper presents the first gradient-free federated learning framework called GRAFFL. It uses implicit information derived from each participating institution to learn posterior distributions of parameters. We propose the GRAFFL-based Bayesian mixture model to serve as a proof-of-concept of the framework.
arXiv Detail & Related papers (2020-08-29T07:19:44Z)
LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy [20.95527613004989]
Federated learning is a popular approach for privacy protection that collects the local gradient information instead of real data. Previous works do not give a practical solution due to three issues. Last, the privacy budget explodes due to the high dimensionality of weights in deep learning models.
arXiv Detail & Related papers (2020-07-31T01:08:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.