An Efficient Learning Framework For Federated XGBoost Using Secret
Sharing And Distributed Optimization
- URL: http://arxiv.org/abs/2105.05717v1
- Date: Wed, 12 May 2021 15:04:18 GMT
- Title: An Efficient Learning Framework For Federated XGBoost Using Secret
Sharing And Distributed Optimization
- Authors: Lunchen Xie, Jiaqi Liu, Songtao Lu, Tsung-hui Chang, Qingjiang Shi
- Abstract summary: XGBoost is one of the most widely used machine learning models in industry due to its superior learning accuracy and efficiency.
It is crucial to deploy a secure and efficient federated XGBoost (FedXGB) model to tackle data isolation issues in big data problems.
In this paper, a multi-party federated XGB learning framework is proposed with a security guarantee, which reshapes XGBoost's split criterion calculation under a secret sharing setting.
Remarkably, a thorough analysis of model security is provided as well, and multiple numerical results showcase the superiority of the proposed FedXGB.
- Score: 47.70500612425959
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: XGBoost is one of the most widely used machine learning models in
industry due to its superior learning accuracy and efficiency. To address data
isolation in big data problems, it is crucial to deploy a secure and efficient
federated XGBoost (FedXGB) model. Existing FedXGB models either suffer from
data leakage or are only applicable to the two-party setting with heavy
communication and computation overheads. In this paper, a lossless multi-party
federated XGB learning framework with a security guarantee is proposed, which
reshapes XGBoost's split criterion calculation under a secret sharing setting
and solves the leaf weight calculation problem by leveraging distributed
optimization. A thorough analysis of model security is provided as well, and
multiple numerical results showcase the superiority of the proposed FedXGB over
state-of-the-art models on benchmark datasets.
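XGBoost chooses splits by the gain 0.5 * [G_L^2/(H_L+lambda) + G_R^2/(H_R+lambda) - (G_L+G_R)^2/(H_L+H_R+lambda)] and sets leaf weights to w* = -G/(H+lambda), where G and H are sums of first- and second-order gradients. Under additive secret sharing, each party holds only a random share of these sums, so the statistics can be aggregated without any party seeing another's raw values. Below is a minimal sketch of that idea, not the paper's actual protocol: the fixed-point encoding, field size, and the final reconstruction step are simplifying assumptions.

```python
"""Minimal sketch of split-gain evaluation over additively secret-shared
gradient statistics. Illustration of the general idea only; the paper
computes the criterion on shares rather than reconstructing the sums."""
import random

P = 2**61 - 1          # prime modulus for the secret-sharing field (assumed)
SCALE = 10**6          # fixed-point scale for encoding reals (assumed)

def encode(x):         # real -> field element
    return int(round(x * SCALE)) % P

def decode(v):         # field element -> real (handles negatives)
    if v > P // 2:
        v -= P
    return v / SCALE

def share(secret, n_parties):
    """Split a field element into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Each party locally aggregates (encoded) gradient/hessian sums for a
# candidate split, then secret-shares them with the other parties.
G_left, H_left = 12.4, 8.1     # toy aggregated first/second-order sums
G_right, H_right = -3.7, 5.2
lam = 1.0                      # L2 regularization on leaf weights

stats = [G_left, H_left, G_right, H_right]
shared = [share(encode(s), 3) for s in stats]   # 3 parties

# Adding shares is a local operation; only the final sums are reconstructed,
# so no party ever sees another party's raw statistics.
recon = [decode(sum(col) % P) for col in shared]
G_l, H_l, G_r, H_r = recon

# XGBoost split gain (complexity penalty gamma omitted) and optimal leaf
# weight w* = -G / (H + lambda), evaluated after reconstruction here.
gain = 0.5 * (G_l**2 / (H_l + lam) + G_r**2 / (H_r + lam)
              - (G_l + G_r)**2 / (H_l + H_r + lam))
w_left = -G_l / (H_l + lam)
print(f"gain={gain:.4f}, left leaf weight={w_left:.4f}")
```

In a real secret-sharing protocol only the comparison between candidate gains would be revealed; reconstructing G and H as above is purely for demonstration.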
Related papers
- A robust three-way classifier with shadowed granular-balls based on justifiable granularity [53.39844791923145]
We construct a robust three-way classifier with shadowed GBs for uncertain data.
Our model demonstrates robustness in managing uncertain data and effectively mitigates classification risks.
arXiv Detail & Related papers (2024-07-03T08:54:45Z)
- xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token [108.7069350303884]
xRAG is an innovative context compression method tailored for retrieval-augmented generation.
xRAG seamlessly integrates document embeddings into the language model representation space.
Experimental results demonstrate that xRAG achieves an average improvement of over 10% across six knowledge-intensive tasks.
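xRAG's one-token compression amounts to projecting a frozen retriever's document embedding into the language model's input embedding space, where it is prepended as a single soft token. A rough numpy sketch of that interface; the dimensions and the random linear map below are illustrative assumptions standing in for xRAG's trained modality projector:

```python
import numpy as np

rng = np.random.default_rng(0)
d_retriever, d_lm = 768, 4096        # assumed embedding widths

# Frozen document embedding from the retriever (one vector per document).
doc_emb = rng.normal(size=d_retriever)

# Modality projector: in xRAG a small trained network; a random linear
# map stands in for it here.
W = rng.normal(size=(d_lm, d_retriever)) / np.sqrt(d_retriever)
soft_token = W @ doc_emb             # one LM-space "token" for the document

# Prepend the soft token to the prompt's token embeddings, so the LM
# consumes the whole retrieved document as a single extra position.
prompt_embs = rng.normal(size=(12, d_lm))   # stand-in prompt embeddings
inputs_embs = np.vstack([soft_token[None, :], prompt_embs])
print(inputs_embs.shape)             # (13, 4096): 1 doc token + 12 prompt tokens
```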
arXiv Detail & Related papers (2024-05-22T16:15:17Z)
- Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling [29.84946857859386]
We study the problem of automatically annotating relevant numerals occurring in financial documents with their corresponding tags.
We propose a parameter-efficient solution for the task using LoRA.
Our proposed model, FLAN-FinXC, achieves new state-of-the-art performance on both datasets.
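LoRA's parameter efficiency comes from freezing the base weight W and learning only a low-rank update BA, so the adapted layer computes x(W + (alpha/r)BA)^T with just r*(d_in + d_out) trainable parameters. A minimal numpy sketch following the standard LoRA recipe; the sizes are assumptions and nothing here is specific to FLAN-FinXC:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 512, 512, 8, 16   # assumed sizes; r << d

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable, small init
B = np.zeros((d_out, r))                  # trainable, zero init => no-op at start

def lora_forward(x):
    # Base path plus low-rank update, scaled by alpha / r as in LoRA.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))
print(lora_forward(x).shape)              # (4, 512)
# Trainable params: r*(d_in + d_out) = 8,192 vs. 262,144 frozen in W.
```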
arXiv Detail & Related papers (2024-05-03T16:41:36Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
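The knowledge-distillation half of FedCOG trains the local model to match the global model's softened predictions on generated data. A generic temperature-scaled distillation loss sketched in numpy; the temperature and the data-generation half are standard KD choices and an omission, respectively, not details from the paper:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p_t = softmax(teacher_logits, T)       # global model as teacher
    p_s = softmax(student_logits, T)       # local model as student
    return (T * T) * np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1))

rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 10))         # logits on (generated) samples
student = teacher + 0.1 * rng.normal(size=(8, 10))
print(distill_loss(student, teacher))      # small: student tracks teacher
```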
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
- FedDisco: Federated Learning with Discrepancy-Aware Collaboration [41.828780724903744]
We propose a novel aggregation method, Federated Learning with Discrepancy-aware Collaboration (FedDisco).
Our FedDisco outperforms several state-of-the-art methods and can be easily incorporated with many existing methods to further enhance the performance.
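FedDisco's aggregation weights account not only for each client's dataset size but also for the discrepancy between its label distribution and the global one. A plausible sketch of such weighting; the specific combination below (size share minus a scaled discrepancy, passed through a ReLU) is an illustrative assumption, not necessarily the paper's exact formula:

```python
import numpy as np

def disco_weights(n_samples, label_dists, global_dist, a=0.5, b=0.1):
    """Aggregation weights from dataset size and distribution discrepancy.
    Assumed form: relu(size_share - a * discrepancy + b), renormalized."""
    n = np.asarray(n_samples, dtype=float)
    size_share = n / n.sum()
    # L2 discrepancy between each client's label distribution and global.
    d = np.linalg.norm(np.asarray(label_dists) - np.asarray(global_dist), axis=1)
    w = np.maximum(size_share - a * d + b, 0.0)
    return w / w.sum()

# Three clients: equal sizes, but client 2 is heavily skewed toward class 0.
dists = [[0.5, 0.5], [0.4, 0.6], [0.95, 0.05]]
print(disco_weights([100, 100, 100], dists, global_dist=[0.5, 0.5]))
# The skewed client is down-weighted relative to plain size-based averaging.
```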
arXiv Detail & Related papers (2023-05-30T17:20:51Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that maximizes data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning [21.79965380400454]
Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data samples, to jointly train a useful global model.
Feature selection (FS) is important to VFL. It remains an open research problem, as existing FS works designed for VFL assume prior knowledge either on the number of noisy features or on the post-training threshold of useful features.
We propose the Federated Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian dual-gate to efficiently approximate the probability of a feature being selected, with privacy protection.
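The Gaussian gate idea follows the stochastic-gates line of work: each feature gets a gate z = clip(mu + sigma * eps, 0, 1) with eps ~ N(0, 1), so the probability of the feature being selected, P(z > 0) = Phi(mu / sigma), has a closed form. A sketch of that approximation; the mu values and fixed sigma are assumptions, and the paper's encryption layer is omitted:

```python
import numpy as np
from math import erf, sqrt

SIGMA = 0.5                     # assumed fixed gate noise scale

def gate_sample(mu, rng):
    """Hard-clipped Gaussian gate in [0, 1]; multiplies the feature."""
    return np.clip(mu + SIGMA * rng.normal(size=mu.shape), 0.0, 1.0)

def select_prob(mu):
    """Closed-form P(gate > 0) = Phi(mu / sigma) for a Gaussian gate."""
    return 0.5 * (1.0 + np.vectorize(erf)(mu / (SIGMA * sqrt(2.0))))

rng = np.random.default_rng(0)
mu = np.array([-1.0, 0.0, 0.5, 2.0])   # learned per-feature gate means
x = rng.normal(size=(3, 4))            # a batch with 4 features
print(x * gate_sample(mu, rng))        # gated features used in training
print(select_prob(mu))                 # ~[0.023, 0.5, 0.841, 1.0]
```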
arXiv Detail & Related papers (2023-02-21T03:09:45Z)
- FedADMM: A Robust Federated Deep Learning Framework with Adaptivity to System Heterogeneity [4.2059108111562935]
Federated Learning (FL) is an emerging framework for distributed processing of large data volumes by edge devices.
In this paper, we introduce a new FedADMM-based FL protocol.
We show that FedADMM consistently outperforms all baseline methods in terms of communication efficiency.
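In ADMM-based FL, each client keeps a local model x_k and a dual variable u_k; clients minimize an augmented Lagrangian locally, the server averages to update the consensus model z, and the duals absorb the residuals. A toy numpy sketch on quadratic client objectives; the losses and rho below are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, rho = 5, 3, 1.0
A = rng.normal(size=(K, d))            # client k's quadratic target a_k
# Assumed client objective for the demo: f_k(x) = 0.5 * ||x - a_k||^2

x = np.zeros((K, d))                   # local models
u = np.zeros((K, d))                   # scaled dual variables
z = np.zeros(d)                        # server-side consensus model

for _ in range(50):
    # Local step: closed-form argmin of f_k(x) + (rho/2)||x - z + u_k||^2.
    x = (A + rho * (z - u)) / (1.0 + rho)
    # Server step: consensus update, then dual updates on the clients.
    z = (x + u).mean(axis=0)
    u = u + x - z

print(np.allclose(z, A.mean(axis=0), atol=1e-6))   # True: consensus = average
```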
arXiv Detail & Related papers (2022-04-07T15:58:33Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
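The Cauchy-Schwarz divergence D_CS(p, q) = -log( int p q / sqrt(int p^2 * int q^2) ) is analytic for GMMs because the Gaussian product integral has a closed form: int N(x; m1, S1) N(x; m2, S2) dx = N(m1 - m2; 0, S1 + S2). A numpy sketch of that computation for diagonal-covariance mixtures; this shows the standard closed form only, not the paper's full training objective:

```python
import numpy as np

def gauss_overlap(m1, v1, m2, v2):
    """int N(x; m1, diag v1) * N(x; m2, diag v2) dx = N(m1-m2; 0, diag(v1+v2))."""
    v = v1 + v2
    d = m1 - m2
    return np.exp(-0.5 * np.sum(d * d / v)) / np.sqrt(np.prod(2 * np.pi * v))

def gmm_cross(w_p, mus_p, vars_p, w_q, mus_q, vars_q):
    """int p(x) q(x) dx for two diagonal-covariance GMMs."""
    total = 0.0
    for wi, mi, vi in zip(w_p, mus_p, vars_p):
        for wj, mj, vj in zip(w_q, mus_q, vars_q):
            total += wi * wj * gauss_overlap(mi, vi, mj, vj)
    return total

def cs_divergence(p, q):
    """-log( int pq / sqrt(int p^2 * int q^2) ); zero iff p = q."""
    pq = gmm_cross(*p, *q)
    return -np.log(pq / np.sqrt(gmm_cross(*p, *p) * gmm_cross(*q, *q)))

# Two toy 2-component GMMs in 2D: (weights, means, diagonal variances).
p = ([0.5, 0.5], [np.zeros(2), np.ones(2)], [np.ones(2), np.ones(2)])
q = ([0.5, 0.5], [np.zeros(2) + 0.5, np.ones(2)], [np.ones(2), np.ones(2)])
print(cs_divergence(p, p))   # 0.0 (up to rounding)
print(cs_divergence(p, q))   # small positive value
```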
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Large-Scale Secure XGB for Vertical Federated Learning [15.864654742542246]
In this paper, we aim to build a large-scale secure XGB under the vertical federated learning setting.
We employ secure multi-party computation techniques to avoid leaking intermediate information during training.
By proposing secure permutation protocols, we improve training efficiency and make the framework scale to large datasets.
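In vertical XGB, the party holding a feature knows the samples' sort order, while gradients live in shares at the other parties; a secure permutation lets the feature owner apply its private order to the shared gradients so prefix sums yield split statistics without revealing individual values. The sketch below shows only the ideal functionality of such a permutation, with a trusted-dealer stand-in replacing the actual cryptographic shuffle; this is a deliberate simplification, not the paper's protocol:

```python
import random

P = 2**61 - 1   # additive secret-sharing modulus (assumed)

def share(v):
    r = random.randrange(P)
    return r, (v - r) % P

def reveal(a, b):
    return (a + b) % P

def secure_permute(shares_a, shares_b, pi):
    """Ideal functionality: given shares of g and a permutation pi known only
    to party A, output *fresh* shares of g permuted by pi. A trusted dealer
    stands in for the cryptographic protocol here."""
    g = [reveal(a, b) for a, b in zip(shares_a, shares_b)]   # dealer only
    fresh = [share(g[i]) for i in pi]
    return [f[0] for f in fresh], [f[1] for f in fresh]

grads = [3, 1, 4, 1, 5]                  # party B's (encoded) gradients
sa, sb = zip(*[share(g) for g in grads]) # additively shared between A and B
pi = [2, 0, 4, 1, 3]                     # A's private sort order by feature
pa, pb = secure_permute(list(sa), list(sb), pi)
print([reveal(a, b) for a, b in zip(pa, pb)])   # [4, 3, 5, 1, 1]
# Prefix sums over the permuted shares give left/right gradient totals for
# every candidate split of A's feature without exposing single gradients.
```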
arXiv Detail & Related papers (2020-05-18T06:31:10Z)