Secure Federated XGBoost with CUDA-accelerated Homomorphic Encryption via NVIDIA FLARE
- URL: http://arxiv.org/abs/2504.03909v1
- Date: Fri, 04 Apr 2025 20:08:24 GMT
- Title: Secure Federated XGBoost with CUDA-accelerated Homomorphic Encryption via NVIDIA FLARE
- Authors: Ziyue Xu, Yuan-Ting Hsieh, Zhihong Zhang, Holger R. Roth, Chester Chen, Yan Cheng, Andrew Feng,
- Abstract summary: Federated learning (FL) enables collaborative model training across decentralized datasets.<n> NVIDIA FLARE's Federated XGBoost extends the popular XGBoost algorithm to both vertical and horizontal federated settings.<n>Initial implementation assumed mutual trust over the sharing of intermediate statistics.<n>We introduce "Secure Federated XGBoost", an efficient solution to mitigate these risks.
- Score: 6.053716038605071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) enables collaborative model training across decentralized datasets. NVIDIA FLARE's Federated XGBoost extends the popular XGBoost algorithm to both vertical and horizontal federated settings, facilitating joint model development without direct data sharing. However, the initial implementation assumed mutual trust over the sharing of intermediate gradient statistics produced by the XGBoost algorithm, leaving potential vulnerabilities to honest-but-curious adversaries. This work introduces "Secure Federated XGBoost", an efficient solution to mitigate these risks. We implement secure federated algorithms for both vertical and horizontal scenarios, addressing diverse data security patterns. To secure the messages, we leverage homomorphic encryption (HE) to protect sensitive information during training. A novel plugin and processor interface seamlessly integrates HE into the Federated XGBoost pipeline, enabling secure aggregation over ciphertexts. We present both CPU-based and CUDA-accelerated HE plugins, demonstrating significant performance gains. Notably, our CUDA-accelerated HE implementation achieves up to 30x speedups in vertical Federated XGBoost compared to existing third-party solutions. By securing critical computation steps and encrypting sensitive assets, Secure Federated XGBoost provides robust data privacy guarantees, reinforcing the fundamental benefits of federated learning while maintaining high performance.
Related papers
- Bilateral Differentially Private Vertical Federated Boosted Decision Trees [10.952674399412405]
Federated learning is a distributed machine learning paradigm that enables collaborative training across multiple parties while ensuring data privacy.
In this paper, we propose a variant of vertical federated XGBoost with bilateral differential privacy guarantee: MaskedXGBoost.
Our algorithm's superiority in both utility and efficiency has been validated on multiple datasets.
arXiv Detail & Related papers (2025-04-30T15:37:44Z) - Federated Instruction Tuning of LLMs with Domain Coverage Augmentation [87.49293964617128]
Federated Domain-specific Instruction Tuning (FedDIT) utilizes limited cross-client private data together with various strategies of instruction augmentation.<n>We propose FedDCA, which optimize domain coverage through greedy client center selection and retrieval-based augmentation.<n>For client-side computational efficiency and system scalability, FedDCA$*$, the variant of FedDCA, utilizes heterogeneous encoders with server-side feature alignment.
arXiv Detail & Related papers (2024-09-30T09:34:31Z) - EncCluster: Scalable Functional Encryption in Federated Learning through Weight Clustering and Probabilistic Filters [3.9660142560142067]
Federated Learning (FL) enables model training across decentralized devices by communicating solely local model updates to an aggregation server.
FL remains vulnerable to inference attacks during model update transmissions.
We present EncCluster, a novel method that integrates model compression through weight clustering with recent decentralized FE and privacy-enhancing data encoding.
arXiv Detail & Related papers (2024-06-13T14:16:50Z) - SOCI^+: An Enhanced Toolkit for Secure OutsourcedComputation on Integers [50.608828039206365]
We propose SOCI+ which significantly improves the performance of SOCI.
SOCI+ employs a novel (2, 2)-threshold Paillier cryptosystem with fast encryption and decryption as its cryptographic primitive.
Compared with SOCI, our experimental evaluation shows that SOCI+ is up to 5.4 times more efficient in computation and 40% less in communication overhead.
arXiv Detail & Related papers (2023-09-27T05:19:32Z) - GIFD: A Generative Gradient Inversion Method with Feature Domain
Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge.
We propose textbfGradient textbfInversion over textbfFeature textbfDomains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z) - Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive
Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch descent gradient.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z) - FedXGBoost: Privacy-Preserving XGBoost for Federated Learning [10.304484601250948]
Federated learning is the distributed machine learning framework that enables collaborative training across multiple parties while ensuring data privacy.
We propose two variants of federated XGBoost with privacy guarantee: FedXGBoost-SMM and FedXGBoost-LDP.
arXiv Detail & Related papers (2021-06-20T09:17:45Z) - An Efficient Learning Framework For Federated XGBoost Using Secret
Sharing And Distributed Optimization [47.70500612425959]
XGBoost is one of the most widely used machine learning models in the industry due to its superior learning accuracy and efficiency.
It is crucial to deploy a secure and efficient federated XGBoost (FedXGB) model to tackle data isolation issues in the big data problems.
In this paper, a multi-party federated XGB learning framework is proposed with a security guarantee, which reshapes the XGBoost's split criterion calculation process under a secret sharing setting.
Remarkably, a thorough analysis of model security is provided as well, and multiple numerical results showcase the superiority of the proposed FedXGB
arXiv Detail & Related papers (2021-05-12T15:04:18Z) - Secure Bilevel Asynchronous Vertical Federated Learning with Backward
Updating [159.48259714642447]
Vertical scalable learning (VFL) attracts increasing attention due to the demands of multi-party collaborative modeling and concerns of privacy leakage.
We propose a novel bftextlevel parallel architecture (VF$bfB2$), under which three new algorithms, including VF$B2$, are proposed.
arXiv Detail & Related papers (2021-03-01T12:34:53Z) - FedBoosting: Federated Learning with Gradient Protected Boosting for
Text Recognition [7.988454173034258]
Federated Learning (FL) framework allows learning a shared model collaboratively without data being centralized or shared among data owners.
We show in this paper that the generalization ability of the joint model is poor on Non-Independent and Non-Identically Distributed (Non-IID) data.
We propose a novel boosting algorithm for FL to address both the generalization and gradient leakage issues.
arXiv Detail & Related papers (2020-07-14T18:47:23Z) - Cloud-based Federated Boosting for Mobile Crowdsensing [29.546495197035366]
We propose a secret sharing based federated learning architecture FedXGB to achieve the privacy-preserving extreme gradient boosting for mobile crowdsensing.
Specifically, we first build a secure classification and regression tree (CART) of XGBoost using secret sharing.
Then, we propose a secure prediction protocol to protect the model privacy of XGBoost in mobile crowdsensing.
arXiv Detail & Related papers (2020-05-09T08:49:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.