Guard-GBDT: Efficient Privacy-Preserving Approximated GBDT Training on Vertical Dataset
- URL: http://arxiv.org/abs/2507.20688v1
- Date: Mon, 28 Jul 2025 10:16:37 GMT
- Title: Guard-GBDT: Efficient Privacy-Preserving Approximated GBDT Training on Vertical Dataset
- Authors: Anxiao Song, Shujie Cui, Jianli Bai, Ke Cheng, Yulong Shen, Giovanni Russello
- Abstract summary: Guard-GBDT is an innovative framework tailored for efficient and privacy-preserving GBDT training on vertical datasets. We implement a prototype of Guard-GBDT and extensively evaluate its performance and accuracy on various real-world datasets.
- Score: 15.175697228634979
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In light of increasing privacy concerns and stringent legal regulations, using secure multiparty computation (MPC) to enable collaborative GBDT model training among multiple data owners has garnered significant attention. Despite this, existing MPC-based GBDT frameworks face efficiency challenges due to high communication costs and the computational burden of non-linear operations such as division and sigmoid calculations. In this work, we introduce Guard-GBDT, an innovative framework tailored for efficient and privacy-preserving GBDT training on vertical datasets. Guard-GBDT bypasses MPC-unfriendly division and sigmoid functions by using more streamlined approximations and reduces communication overhead by compressing the messages exchanged during gradient aggregation. We implement a prototype of Guard-GBDT and extensively evaluate its performance and accuracy on various real-world datasets. The results show that Guard-GBDT outperforms the state-of-the-art HEP-XGB (CIKM'21) and SiGBDT (ASIA CCS'24) by up to $2.71\times$ and $12.21\times$ on LAN networks and up to $2.7\times$ and $8.2\times$ on WAN networks. Guard-GBDT also achieves accuracy comparable to SiGBDT and plaintext XGBoost (and better than HEP-XGB), deviating by only $\pm1\%$ to $\pm2\%$. Our implementation code is provided at https://github.com/XidianNSS/Guard-GBDT.git.
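The abstract's "more streamlined approximations" for MPC-unfriendly sigmoid can be illustrated with a toy piecewise-linear surrogate. This is a generic, hypothetical stand-in (the paper's actual approximation is not reproduced here); the point is that the surrogate needs only comparisons and multiplication by a public constant, which are cheap over secret shares, while exponentiation and division require expensive interactive protocols:

```python
import math

def sigmoid(x):
    """Exact logistic function (expensive under MPC: exp + division)."""
    return 1.0 / (1.0 + math.exp(-x))

def pw_linear_sigmoid(x):
    """Piecewise-linear surrogate using only comparisons and
    multiplication by a public constant -- MPC-friendly operations.
    Continuous at the clipping points x = -4 and x = 4."""
    if x <= -4.0:
        return 0.0
    if x >= 4.0:
        return 1.0
    return 0.5 + 0.125 * x
```

On the interval $[-4, 4]$ this crude surrogate stays within roughly 0.14 of the true sigmoid; real systems tune the breakpoints and slope (or use low-degree polynomials) to trade accuracy against MPC cost.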
Related papers
- DP-CSGP: Differentially Private Stochastic Gradient Push with Compressed Communication [71.60998478544028]
We propose Differentially Private Stochastic Gradient Push with Compressed communication (termed DP-CSGP) for decentralized learning over graphs. For general nonconvex and smooth objective functions, we show that our algorithm maintains high accuracy with efficient communication.
arXiv Detail & Related papers (2025-12-15T17:37:02Z) - Breaking the Layer Barrier: Remodeling Private Transformer Inference with Hybrid CKKS and MPC [16.452180247201948]
This paper presents an efficient framework for private Transformer inference that combines Homomorphic Encryption (HE) and Secure Multi-party Computation (MPC) to protect data privacy. The proposed framework, dubbed BLB, breaks layers down into fine-grained operators and further fuses adjacent linear operators, reducing the need for HE/MPC conversions. BLB achieves a $21\times$ reduction in communication overhead compared to BOLT (S&P'24) and a $2\times$ reduction compared to Bumblebee (NDSS'25), along with latency reductions of $13\times$ and $1.8\times$, respectively.
arXiv Detail & Related papers (2025-08-27T02:40:50Z) - Privacy-Preserving Inference for Quantized BERT Models [13.36359444231145]
Quantization offers a promising solution by converting floating-point operations into lower-precision integer computations. We propose a fine-grained, layer-wise quantization scheme and support 1-bit weight fully connected layers in a secure setting.
arXiv Detail & Related papers (2025-08-03T07:52:08Z) - LGBQPC: Local Granular-Ball Quality Peaks Clustering [51.58924743533048]
The density peaks clustering (DPC) algorithm has attracted considerable attention for its ability to detect arbitrarily shaped clusters. Recent advancements integrating granular-ball computing with DPC have led to the GB-based DPC algorithm, which improves computational efficiency. This paper proposes the local GB quality peaks clustering (LGBQPC) algorithm, which offers comprehensive improvements to GBDPC in both GB generation and clustering processes.
arXiv Detail & Related papers (2025-05-16T15:26:02Z) - DPZV: Elevating the Tradeoff between Privacy and Utility in Zeroth-Order Vertical Federated Learning [9.302691218735406]
We propose DPZV, the first ZO optimization framework for Vertical Federated Learning (VFL) that achieves tunable differential privacy with performance guarantees. We conduct a comprehensive theoretical analysis showing that DPZV matches the convergence rate of first-order optimization methods while satisfying formal $(\epsilon, \delta)$-DP guarantees. Experiments on image and language benchmarks demonstrate that DPZV outperforms several baselines in terms of accuracy under a wide range of privacy constraints.
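For context on the zeroth-order (ZO) optimization that DPZV builds on, the classic two-point gradient estimator can be sketched as follows. This is an illustrative textbook estimator, not DPZV's actual algorithm: it probes the loss at two symmetric perturbations along a random direction and never touches analytic gradients.

```python
import random

def zo_gradient(f, x, mu=1e-3, rng=random):
    """Two-point zeroth-order gradient estimate of f at x along a
    random Gaussian direction u; unbiased for quadratics and
    approximately unbiased for smooth f as mu -> 0."""
    u = [rng.gauss(0.0, 1.0) for _ in x]
    x_plus = [xi + mu * ui for xi, ui in zip(x, u)]
    x_minus = [xi - mu * ui for xi, ui in zip(x, u)]
    slope = (f(x_plus) - f(x_minus)) / (2.0 * mu)  # directional derivative
    return [slope * ui for ui in u]
```

A single estimate is noisy, but averaging many independent estimates recovers the true gradient; in a VFL setting only scalar function values cross party boundaries, which is what makes such estimators attractive there.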
arXiv Detail & Related papers (2025-02-27T22:07:16Z) - Parallel Sequence Modeling via Generalized Spatial Propagation Network [80.66202109995726]
Generalized Spatial Propagation Network (GSPN) is a new attention mechanism for optimized vision tasks that inherently captures 2D spatial structures. GSPN overcomes limitations by directly operating on spatially coherent image data and forming dense pairwise connections through a line-scan approach. GSPN achieves superior spatial fidelity and state-of-the-art performance in vision tasks, including ImageNet classification, class-guided image generation, and text-to-image generation.
arXiv Detail & Related papers (2025-01-21T18:56:19Z) - Efficiently Achieving Secure Model Training and Secure Aggregation to Ensure Bidirectional Privacy-Preservation in Federated Learning [36.94596192980534]
Bidirectional privacy-preserving federated learning is crucial, as both local gradients and the global model may leak privacy. We design an efficient, high-accuracy bidirectional privacy-preserving scheme for federated learning that completes secure model training and secure aggregation. Our scheme significantly outperforms state-of-the-art bidirectional privacy-preservation baselines in terms of computational cost, model accuracy, and defense ability.
arXiv Detail & Related papers (2024-12-16T12:58:21Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\left(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\right)$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Differentially Private Bias-Term Fine-tuning of Foundation Models [36.55810474925956]
We study the problem of differentially private (DP) fine-tuning of large pre-trained models.
We propose DP-BiTFiT, which matches the state-of-the-art accuracy for DP algorithms and the efficiency of the standard BiTFiT.
On a wide range of tasks, DP-BiTFiT is $2\sim30\times$ faster and uses $2\sim8\times$ less memory than DP full fine-tuning.
arXiv Detail & Related papers (2022-09-30T18:30:48Z) - Quantized Training of Gradient Boosting Decision Trees [84.97123593657584]
We propose to quantize all the high-precision gradients in a very simple yet effective way in the GBDT's training algorithm.
With low-precision gradients, most arithmetic operations in GBDT training can be replaced by integer operations of 8, 16, or 32 bits.
We observe up to $2\times$ speedup of our simple quantization strategy compared with SOTA GBDT systems on extensive datasets.
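The low-precision gradient idea can be illustrated with a minimal stochastic-rounding quantizer (a generic sketch, not the paper's exact scheme). Stochastic rounding keeps the quantized gradient unbiased, so histogram sums built from small integers still estimate the true gradient sums:

```python
import math
import random

def quantize_gradient(g, scale, rng=random):
    """Map a float gradient to a small integer via stochastic
    rounding: round down with probability (1 - frac), up with
    probability frac, so E[q * scale] == g (unbiased)."""
    v = g / scale
    lo = math.floor(v)
    frac = v - lo
    return lo + (1 if rng.random() < frac else 0)

def dequantize(q, scale):
    """Recover an (unbiased) float estimate of the gradient."""
    return q * scale
```

With an appropriate `scale`, the integers fit in 8 or 16 bits, so per-bin accumulation during histogram construction becomes cheap integer arithmetic.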
arXiv Detail & Related papers (2022-07-20T06:27:06Z) - Rethinking and Scaling Up Graph Contrastive Learning: An Extremely Efficient Approach with Group Discrimination [87.07410882094966]
Graph contrastive learning (GCL) alleviates the heavy reliance on label information for graph representation learning (GRL).
We introduce a new learning paradigm for self-supervised GRL, namely, Group Discrimination (GD).
Instead of similarity computation, GGD directly discriminates two groups of summarised node instances with a simple binary cross-entropy loss.
In addition, GGD requires much fewer training epochs to obtain competitive performance compared with GCL methods on large-scale datasets.
arXiv Detail & Related papers (2022-06-03T12:32:47Z) - THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption [112.02441503951297]
Privacy-preserving inference of transformer models is in demand among cloud service users.
We introduce THE-X, an approximation approach for transformers that enables privacy-preserving inference of pre-trained models.
arXiv Detail & Related papers (2022-06-01T03:49:18Z) - DTGAN: Differential Private Training for Tabular GANs [6.174448419090292]
We propose DTGAN, a novel conditional Wasserstein GAN that comes in two variants, DTGAN_G and DTGAN_D.
We rigorously evaluate the theoretical privacy guarantees offered by DP empirically against membership and attribute inference attacks.
Our results on 3 datasets show that the DP-SGD framework is superior to PATE and that a DP discriminator is better suited for training convergence.
arXiv Detail & Related papers (2021-07-06T10:28:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.