Privacy-Preserving XGBoost Inference
- URL: http://arxiv.org/abs/2011.04789v4
- Date: Tue, 24 Nov 2020 18:07:27 GMT
- Title: Privacy-Preserving XGBoost Inference
- Authors: Xianrui Meng, Joan Feigenbaum
- Abstract summary: A major barrier to adoption is the sensitive nature of predictive queries.
One central goal of privacy-preserving machine learning (PPML) is to enable users to submit encrypted queries to a remote ML service.
We propose a privacy-preserving XGBoost prediction algorithm, which we have implemented and evaluated empirically on AWS SageMaker.
- Score: 0.6345523830122165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although machine learning (ML) is widely used for predictive tasks, there are
important scenarios in which ML cannot be used or at least cannot achieve its
full potential. A major barrier to adoption is the sensitive nature of
predictive queries. Individual users may lack sufficiently rich datasets to
train accurate models locally but also be unwilling to send sensitive queries
to commercial services that vend such models. One central goal of
privacy-preserving machine learning (PPML) is to enable users to submit
encrypted queries to a remote ML service, receive encrypted results, and
decrypt them locally. We aim to develop practical solutions for real-world
privacy-preserving ML inference problems. In this paper, we propose a
privacy-preserving XGBoost prediction algorithm, which we have implemented and
evaluated empirically on AWS SageMaker. Experimental results indicate that our
algorithm is efficient enough to be used in real ML production environments.
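To give the flavor of the technique, the sketch below shows how an order-preserving encoding can let a server walk a decision tree entirely over ciphertexts: encrypted feature values compare against encrypted thresholds exactly as their plaintexts would. This is a toy illustration under assumptions made here (a small integer feature domain, a lookup-table "encryption", invented leaf tokens and scores), not the paper's exact construction.

```python
# Toy sketch: an order-preserving encoding lets the server compare
# encrypted features against encrypted thresholds, so tree traversal
# over ciphertexts matches traversal over plaintexts. Illustration only.
import random

random.seed(0)

# Secret, strictly increasing mapping over a small integer feature domain.
DOMAIN = range(0, 101)
_table, acc = {}, 0
for v in DOMAIN:
    acc += random.randint(1, 10)   # random positive gaps => monotone
    _table[v] = acc

def ope_enc(v):
    """Client-side order-preserving encoding of a feature value."""
    return _table[v]

# One toy regression tree held by the server: internal nodes store
# encoded thresholds; leaves are opaque tokens whose scores only the
# client can recover (e.g., symmetrically encrypted leaf values).
tree = (0, ope_enc(50),
        ("leaf", "tok_A"),                 # x[0] <= 50
        (1, ope_enc(30),
         ("leaf", "tok_B"),                # x[0] > 50, x[1] <= 30
         ("leaf", "tok_C")))               # x[0] > 50, x[1] > 30

def traverse(node, enc_x):
    """Server-side traversal: compares ciphertexts, learns no plaintexts."""
    if node[0] == "leaf":
        return node[1]
    feat, enc_thr, left, right = node
    return traverse(left if enc_x[feat] <= enc_thr else right, enc_x)

leaf_scores = {"tok_A": 0.1, "tok_B": -0.4, "tok_C": 0.7}  # client secret

enc_query = [ope_enc(63), ope_enc(12)]   # client encodes its query
token = traverse(tree, enc_query)        # server returns an opaque token
print(leaf_scores[token])                # client recovers the score: -0.4
```

For a full XGBoost ensemble the client would decrypt and sum the leaf scores returned for each tree; a deployable scheme would use a real order-preserving cipher and protect leaf values cryptographically rather than with a client-side lookup table.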
Related papers
- SVIP: Towards Verifiable Inference of Open-source Large Language Models [33.910670775972335]
Open-source Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language understanding and generation, leading to widespread adoption across various domains.
Their increasing model sizes render local deployment impractical for individual users, pushing many to rely on computing service providers for inference through a black-box API.
This reliance introduces a new risk: a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without consent from users, thereby delivering inferior outputs while benefiting from cost savings.
arXiv Detail & Related papers (2024-10-29T17:52:45Z)
- MUSE: Machine Unlearning Six-Way Evaluation for Language Models [109.76505405962783]
Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content.
We propose MUSE, a comprehensive machine unlearning evaluation benchmark.
We benchmark how effectively eight popular unlearning algorithms can unlearn Harry Potter books and news articles.
arXiv Detail & Related papers (2024-07-08T23:47:29Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
- Large Language Models Must Be Taught to Know What They Don't Know [97.90008709512921]
We show that fine-tuning on a small dataset of correct and incorrect answers can create an uncertainty estimate with good generalization and small computational overhead.
We also investigate the mechanisms that enable reliable uncertainty estimation, finding that many models can be used as general-purpose uncertainty estimators.
arXiv Detail & Related papers (2024-06-12T16:41:31Z)
- Wildest Dreams: Reproducible Research in Privacy-preserving Neural Network Training [2.853180143237022]
This work focuses on the ML model's training phase, where maintaining user data privacy is of utmost importance.
We provide a solid theoretical background that eases the understanding of current approaches.
We reproduce results for some of the papers and examine at what level existing works in the field provide support for open science.
arXiv Detail & Related papers (2024-03-06T10:25:36Z)
- GuardML: Efficient Privacy-Preserving Machine Learning Services Through Hybrid Homomorphic Encryption [2.611778281107039]
Privacy-Preserving Machine Learning (PPML) methods have been introduced to safeguard the privacy and security of Machine Learning models.
A modern cryptographic scheme, Hybrid Homomorphic Encryption (HHE), has recently emerged.
We develop and evaluate an HHE-based PPML application for classifying heart disease based on sensitive ECG data (a toy sketch of the HHE pattern appears after this list).
arXiv Detail & Related papers (2024-01-26T13:12:52Z)
- HE-MAN -- Homomorphically Encrypted MAchine learning with oNnx models [0.23624125155742057]
Fully homomorphic encryption (FHE) is a promising technique that enables individuals to use ML services without giving up privacy.
We introduce HE-MAN, an open-source machine learning toolset for privacy preserving inference with ONNX models and homomorphically encrypted data.
Compared to prior work, HE-MAN supports a broad range of ML models in ONNX format out of the box without sacrificing accuracy.
arXiv Detail & Related papers (2023-02-16T12:37:14Z)
- Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows data owners to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks of over-reliance on protocols that were not designed with formal definitions of privacy in mind (a minimal FedAvg sketch of the local-training setup appears after this list).
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- Uncertainty-aware Remaining Useful Life predictor [57.74855412811814]
Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate.
In this work, we consider Deep Gaussian Processes (DGPs), which natively quantify predictive uncertainty, as possible solutions to the limitations of point-estimate predictors (a single-layer GP sketch appears after this list).
The performance of the algorithms is evaluated on the N-CMAPSS dataset from NASA for aircraft engines.
arXiv Detail & Related papers (2021-04-08T08:50:44Z)
- CryptoSPN: Privacy-preserving Sum-Product Network Inference [84.88362774693914]
We present a framework for privacy-preserving inference of sum-product networks (SPNs).
CryptoSPN achieves highly efficient and accurate inference in the order of seconds for medium-sized SPNs.
arXiv Detail & Related papers (2020-02-03T14:49:18Z)
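The HHE sketch referenced in the GuardML entry above: the client protects its features with a cheap symmetric cipher and sends the symmetric pads under a homomorphic scheme; the server "transciphers" into homomorphic ciphertexts and evaluates a model without seeing any plaintext. Here Paillier stands in for FHE and an additive pad for the symmetric cipher; the parameters, weights, and feature values are made-up assumptions, not GuardML's actual scheme.

```python
# Toy HHE pattern: additive one-time pad (symmetric layer) + Paillier
# (additively homomorphic layer). Tiny, insecure demo parameters.
import math, random

random.seed(1)

p, q = 1009, 1013                # toy primes; never use in practice
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # Paillier decryption constant

def he_enc(m):
    r = random.randrange(2, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(2, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def he_dec(c):
    return L(pow(c, lam, n2)) * mu % n

# Client: pad features (symmetric layer), HE-encrypt the pads.
x = [120, 80, 35]                          # made-up sensitive features
pads = [random.randrange(n) for _ in x]
sym_ct = [(xi + ki) % n for xi, ki in zip(x, pads)]
enc_pads = [he_enc(ki) for ki in pads]

# Server: transcipher -- Enc(x_i) = Enc(c_i) * Enc(k_i)^-1, since
# Paillier ciphertext multiplication adds plaintexts mod n.
enc_x = [he_enc(ci) * pow(ek, -1, n2) % n2
         for ci, ek in zip(sym_ct, enc_pads)]

# Server: evaluate a linear score homomorphically (integer weights).
w = [2, 3, 5]                              # hypothetical model weights
enc_score = 1
for exi, wi in zip(enc_x, w):
    enc_score = enc_score * pow(exi, wi, n2) % n2

print(he_dec(enc_score))                   # 2*120 + 3*80 + 5*35 = 655
```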
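The FedAvg sketch referenced in the Distributed Machine Learning entry: each client runs a few gradient steps on private data and ships only model parameters, which the server averages. The linear model, learning rate, and data sizes are arbitrary assumptions; as that paper argues, the shared updates themselves can still leak information about the data.

```python
# Minimal FedAvg sketch for a linear model with squared loss.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])    # assumed ground-truth weights

def make_client(n=50):
    """Private dataset that never leaves the client."""
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    return X, y

clients = [make_client() for _ in range(5)]

def local_update(w, X, y, lr=0.1, steps=10):
    """A few steps of local gradient descent on squared loss."""
    for _ in range(steps):
        w = w - lr * (2.0 / len(y)) * X.T @ (X @ w - y)
    return w

w_global = np.zeros(3)
for _ in range(20):
    # Only parameter vectors cross the network -- never raw data
    # (though they can still leak information, which is the paper's point).
    local_ws = [local_update(w_global, X, y) for X, y in clients]
    w_global = np.mean(local_ws, axis=0)   # server averages the updates

print(np.round(w_global, 2))               # approaches [ 1. -2.  0.5]
```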
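And the single-layer GP sketch referenced in the Remaining Useful Life entry: even a shallow Gaussian process returns a predictive mean and variance, which is the kind of uncertainty estimate DGPs extend to deeper models. The kernel, data, and prediction grid below are invented for illustration.

```python
# Toy GP regression with an RBF kernel: predictive mean and variance.
import numpy as np

def rbf(a, b, ls=1.0):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

X = np.array([0.0, 1.0, 2.0, 3.0])     # operating hours (toy)
y = np.array([10.0, 8.0, 5.0, 3.0])    # measured health index (toy)
Xs = np.linspace(0, 4, 5)              # prediction grid

K = rbf(X, X) + 1e-4 * np.eye(len(X))  # jitter for numerical stability
Ks = rbf(Xs, X)

mean = Ks @ np.linalg.solve(K, y)
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))

for xs, m, s in zip(Xs, mean, std):
    print(f"t={xs:.1f}  pred={m:5.2f} +/- {1.96 * s:4.2f}")  # 95% band
```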
This list is automatically generated from the titles and abstracts of the papers in this site.