Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost
- URL: http://arxiv.org/abs/2112.04261v1
- Date: Wed, 8 Dec 2021 12:41:01 GMT
- Title: Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost
- Authors: Wuxing Xu, Hao Fan, Kaixin Li, Kai Yang
- Abstract summary: In this paper, we study the efficiency problem of adapting the widely used XGBoost model in real-world applications to the vertical federated learning setting.
We propose a novel batch homomorphic encryption method that cuts the cost of encryption-related computation and transmission nearly in half.
- Score: 9.442606239058806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: More and more organizations and institutions are making efforts to use external
data to improve the performance of AI services. To address the data privacy and
security concerns, federated learning has attracted increasing attention from
both academia and industry to securely construct AI models across multiple
isolated data providers. In this paper, we study the efficiency problem of
adapting the widely used XGBoost model in real-world applications to the
vertical federated learning setting. State-of-the-art vertical federated
XGBoost frameworks require a large number of encryption operations and
ciphertext transmissions, which make model training much less efficient than
training
XGBoost models locally. To bridge this gap, we propose a novel batch
homomorphic encryption method that cuts the cost of encryption-related
computation and transmission nearly in half. This is achieved by encoding the
first-order
derivative and the second-order derivative into a single number for encryption,
ciphertext transmission, and homomorphic addition operations. The sum of
multiple first-order derivatives and second-order derivatives can be
simultaneously decoded from the sum of encoded values. We are motivated by the
batch idea in the work of BatchCrypt for horizontal federated learning, and
design a novel batch method that addresses its limitation of allowing only a
very small number of negative values. The encoding procedure of the proposed
batch method consists of four steps: shifting, truncating, quantizing, and
batching, while the decoding procedure consists of de-quantization and shifting
back. The advantages of our method are demonstrated through theoretical
analysis and extensive numerical experiments.
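To make the encode/decode pipeline concrete, below is a minimal Python sketch of the four encoding steps (shifting, truncating, quantizing, batching) and the two decoding steps (de-quantization, shifting back). The clipping bounds, bit widths, and slot layout are illustrative assumptions, not the paper's exact parameters, and plain integer addition stands in for the homomorphic addition an additively homomorphic scheme such as Paillier would perform on ciphertexts of the encoded values.

```python
# Sketch of batch encoding for gradient pairs (g, h); parameters are assumptions.
G_MAX = 1.0      # assumed clipping bound for first-order derivatives g
H_MAX = 1.0      # assumed clipping bound for second-order derivatives h
QUANT_BITS = 16  # assumed quantization precision per value
SLOT_BITS = 32   # assumed slot width; the extra headroom lets many encoded
                 # values be summed before the h slot overflows into the g slot

def quantize(x, x_max):
    """Shift x to be non-negative, truncate to the clipping range,
    then map it to an unsigned QUANT_BITS-bit integer."""
    shifted = x + x_max                               # shifting
    truncated = min(max(shifted, 0.0), 2.0 * x_max)   # truncating
    scale = (2 ** QUANT_BITS - 1) / (2.0 * x_max)
    return int(round(truncated * scale))              # quantizing

def dequantize(q, x_max, n_summands):
    """Invert quantization for a sum of n_summands encoded values,
    then remove the accumulated shift (shifting back)."""
    scale = (2 ** QUANT_BITS - 1) / (2.0 * x_max)
    return q / scale - n_summands * x_max

def encode(g, h):
    """Batching: pack quantized g and h into one integer (g in the high slot)."""
    return (quantize(g, G_MAX) << SLOT_BITS) | quantize(h, H_MAX)

def decode(batched, n_summands):
    """Split the two slots and recover the sums of the g's and the h's."""
    q_g = batched >> SLOT_BITS
    q_h = batched & ((1 << SLOT_BITS) - 1)
    return (dequantize(q_g, G_MAX, n_summands),
            dequantize(q_h, H_MAX, n_summands))

# Plain integer addition here mirrors the homomorphic addition an additively
# homomorphic scheme (e.g., Paillier) performs on encrypted encoded values.
pairs = [(0.3, 0.9), (-0.5, 0.4), (0.1, 0.7)]
total = sum(encode(g, h) for g, h in pairs)
sum_g, sum_h = decode(total, n_summands=len(pairs))
print(sum_g, sum_h)  # approximately -0.1 and 2.0, up to quantization error
```

Because the shift makes every quantized value non-negative before batching, summation never produces the sign-handling problems that restrict BatchCrypt-style packing to very few negative summands; the per-slot headroom (SLOT_BITS minus QUANT_BITS bits) bounds how many values can be accumulated before overflow.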
Related papers
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them; a toy sketch of this draft-and-verify loop appears after the related-papers list below.
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying the key properties, output quality and inference acceleration, from a theoretical perspective.
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- CryptoTrain: Fast Secure Training on Encrypted Dataset [17.23344104239024]
We develop a hybrid cryptographic protocol that merges Homomorphic Encryption with Oblivious Transfer (OT) for handling linear and non-linear operations.
By integrating CCMul-Precompute and correlated convolution into CryptoTrain-B, we facilitate a rapid and efficient secure training framework.
arXiv Detail & Related papers (2024-09-25T07:06:14Z)
- HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation.
Recent studies highlight that much LLM-generated code contains serious security vulnerabilities.
We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
- Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion [59.17158389902231]
Speculative decoding has emerged as a widely adopted method to accelerate large language model inference.
This paper proposes an adaptation of speculative decoding which uses discrete diffusion models to generate draft sequences.
arXiv Detail & Related papers (2024-08-10T21:24:25Z)
- FLUE: Federated Learning with Un-Encrypted model weights [0.0]
Federated learning enables devices to collaboratively train a shared model while keeping training data locally stored.
Recent research emphasizes using encrypted model parameters during training.
This paper introduces a novel federated learning algorithm, leveraging coded local gradients without encryption.
arXiv Detail & Related papers (2024-07-26T14:04:57Z)
- Learning Linear Block Error Correction Codes [62.25533750469467]
We propose for the first time a unified encoder-decoder training of binary linear block codes.
We also propose a novel Transformer model in which the self-attention masking is performed in a differentiable fashion for the efficient backpropagation of the code gradient.
arXiv Detail & Related papers (2024-05-07T06:47:12Z)
- Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z)
- Efficient Encoder-Decoder Transformer Decoding for Decomposable Tasks [53.550782959908524]
We introduce a new configuration for encoder-decoder models that improves efficiency on structured output and decomposable tasks.
Our method, prompt-in-decoder (PiD), encodes the input once and decodes the output in parallel, boosting both training and inference efficiency.
arXiv Detail & Related papers (2024-03-19T19:27:23Z)
- Encrypted Dynamic Control exploiting Limited Number of Multiplications and a Method using RLWE-based Cryptosystem [0.3749861135832073]
We present a method to encrypt dynamic controllers that can be implemented through most homomorphic encryption schemes.
As a result, the encrypted controller involves only a limited number of homomorphic multiplications on each piece of encrypted data.
We propose a customization of the method for Ring Learning With Errors (RLWE)-based cryptosystems, where a vector of messages can be encrypted into a single ciphertext.
arXiv Detail & Related papers (2023-07-07T08:24:48Z)
- FFConv: Fast Factorized Neural Network Inference on Encrypted Data [9.868787266501036]
We propose a low-rank factorization method called FFConv to unify convolution and ciphertext packing.
Compared to prior art LoLa and Falcon, our method reduces the inference latency by up to 87% and 12%, respectively.
arXiv Detail & Related papers (2021-02-06T03:10:13Z)
- TEDL: A Text Encryption Method Based on Deep Learning [10.428079716944463]
This paper proposes a novel text encryption method based on deep learning called TEDL.
Results of experiments and relevant analyses show that TEDL performs well in terms of security, efficiency, and generality, and requires less frequent key redistribution.
arXiv Detail & Related papers (2020-03-09T11:04:36Z)
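As referenced in the speculative decoding entry above, the following toy Python sketch shows the draft-and-verify loop that entry describes: a small model proposes a run of tokens and the large model accepts or corrects them. The vocabulary, the two stand-in "models", and the parameter gamma are illustrative assumptions; real implementations score all drafted positions in one forward pass of the large model and sample a bonus token when every draft is accepted, both omitted here for brevity.

```python
# Toy sketch of the lossless draft-and-verify rule behind speculative decoding.
import random

VOCAB = ["a", "b", "c", "d"]

def draft_model(context):
    """Small model: a cheap, uniform proposal over the vocabulary (stand-in)."""
    return {t: 1.0 / len(VOCAB) for t in VOCAB}

def target_model(context):
    """Large model: a fixed, skewed distribution (stand-in)."""
    return dict(zip(VOCAB, [0.4, 0.3, 0.2, 0.1]))

def speculative_step(context, gamma=4):
    """Draft gamma tokens with the small model, then accept each with
    probability min(1, p/q); on rejection, resample from the residual
    distribution so the output still follows the large model exactly."""
    ctx, drafted = list(context), []
    for _ in range(gamma):
        q = draft_model(ctx)
        token = random.choices(VOCAB, weights=[q[t] for t in VOCAB])[0]
        drafted.append((token, q))
        ctx.append(token)

    accepted, ctx = [], list(context)
    for token, q in drafted:
        p = target_model(ctx)
        if random.random() < min(1.0, p[token] / q[token]):
            accepted.append(token)  # draft token validated by the large model
            ctx.append(token)
        else:
            # Residual sampling keeps the overall output distribution equal
            # to the target model's (lossless acceptance rule).
            residual = {t: max(p[t] - q[t], 0.0) for t in VOCAB}
            norm = sum(residual.values())
            token = random.choices(VOCAB, weights=[residual[t] / norm for t in VOCAB])[0]
            accepted.append(token)  # corrected token from the residual
            break                   # later drafts depend on a rejected prefix
    return accepted

print(speculative_step(["a"]))  # a run of accepted (and at most one corrected) tokens
```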
This list is automatically generated from the titles and abstracts of the papers on this site.