Privacy-Preserving Boosting in the Local Setting
- URL: http://arxiv.org/abs/2002.02096v1
- Date: Thu, 6 Feb 2020 04:48:51 GMT
- Title: Privacy-Preserving Boosting in the Local Setting
- Authors: Sen Wang, J.Morris Chang
- Abstract summary: In machine learning, boosting is one of the most popular methods designed to combine multiple base learners into a superior one.
In the big data era, the data held by individuals and entities, such as personal images, browsing history, and census information, is more likely to contain sensitive information.
Local Differential Privacy has been proposed as an effective privacy-protection approach that offers a strong guarantee to data owners.
- Score: 17.375582978294105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In machine learning, boosting is one of the most popular methods designed to combine multiple base learners into a superior one. The well-known Boosted Decision Tree classifier has been widely adopted in many areas. In the big data era, the data held by individuals and entities, such as personal images, browsing history, and census information, is more likely to contain sensitive information. Privacy concerns arise when such data leaves the hands of its owners and is further explored or mined. This privacy issue demands that machine learning algorithms be privacy aware. Recently, Local Differential Privacy has been proposed as an effective privacy-protection approach, which offers a strong guarantee to data owners: the data is perturbed before any further usage, and the true values never leave the owners' hands. Machine learning algorithms that work with such privatized data instances are therefore of great value and importance. In this paper, we are interested in developing a privacy-preserving boosting algorithm that allows a data user to build a classifier without knowing or deriving the exact value of each data sample. Our experiments demonstrate the effectiveness of the proposed boosting algorithm and the high utility of the learned classifiers.
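For concreteness, the sketch below illustrates the local setting with generalized (k-ary) randomized response, a standard ε-LDP mechanism; the function name and parameters here are illustrative assumptions, not the paper's specific encoding or boosting procedure.

```python
# Minimal sketch of owner-side perturbation under local differential privacy,
# using generalized (k-ary) randomized response. Illustrative only; this is
# not the specific mechanism proposed in the paper.
import math
import random

def randomized_response(true_value: int, num_categories: int, epsilon: float) -> int:
    """Report the true category with probability e^eps / (e^eps + k - 1),
    otherwise report one of the other k - 1 categories uniformly at random."""
    e_eps = math.exp(epsilon)
    p_truth = e_eps / (e_eps + num_categories - 1)
    if random.random() < p_truth:
        return true_value
    # Pick a uniformly random category different from the true one.
    other = random.randrange(num_categories - 1)
    return other if other < true_value else other + 1

# Each data owner perturbs locally; only the noisy report leaves their hands.
report = randomized_response(true_value=2, num_categories=5, epsilon=1.0)
```

Because the reporting probabilities are known, a data user who collects many such reports can debias the observed category frequencies to estimate the true distribution; this is the kind of aggregate statistic a learner could be built on without ever seeing the raw values.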
Related papers
- FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation [4.772368796656325]
In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments.
We developed the demo prototype FT-PrivacyScore to show that it's possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task.
arXiv Detail & Related papers (2024-10-30T02:41:26Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on data and allows for defining non-sensitive temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes the conceptual model called PrivChatGPT, a privacy-generative model for LLMs.
PrivChatGPT consists of two main components: preserving user privacy during data curation/pre-processing together with private context, and the private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z)
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Private Boosted Decision Trees via Smooth Re-Weighting [2.099922236065961]
Differential Privacy is the appropriate mathematical framework for formal guarantees of privacy.
We propose and test a practical algorithm for boosting decision trees that guarantees differential privacy.
arXiv Detail & Related papers (2022-01-29T20:08:52Z)
- TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations [49.20701800683092]
We present TIPRDC, a task-independent privacy-respecting data crowdsourcing framework with anonymized intermediate representation.
The goal of this framework is to learn a feature extractor that hides privacy information from the intermediate representations, while maximally retaining the original information embedded in the raw data so that the data collector can accomplish unknown learning tasks.
arXiv Detail & Related papers (2020-05-23T06:21:26Z)
- Utility-aware Privacy-preserving Data Releasing [7.462336024223669]
We propose a two-step perturbation-based privacy-preserving data releasing framework.
First, certain predefined privacy and utility problems are learned from the public domain data.
We then leverage the learned knowledge to precisely perturb the data owners' data into privatized data.
arXiv Detail & Related papers (2020-05-09T05:32:46Z)
- Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification [14.095523601311374]
We consider the problem of publicly releasing a dataset for support vector machine classification while not infringing on the privacy of data subjects.
The dataset is systematically obfuscated using additive noise for privacy protection.
Conditions are established for ensuring that the classifier extracted from the original dataset and the obfuscated one are close to each other.
arXiv Detail & Related papers (2019-12-29T03:32:38Z)