Differentially Private Ensemble Classifiers for Data Streams
- URL: http://arxiv.org/abs/2112.04640v1
- Date: Thu, 9 Dec 2021 00:55:04 GMT
- Title: Differentially Private Ensemble Classifiers for Data Streams
- Authors: Lovedeep Gondara, Ke Wang, Ricardo Silva Carvalho
- Abstract summary: Adapting to evolving data characteristics (concept drift) while protecting data owners' private information is an open challenge.
We present a differentially private ensemble solution to this problem with two distinguishing features.
It allows an \textit{unbounded} number of ensemble updates to deal with the potentially never-ending data streams.
It is \textit{model agnostic}, in that it treats any pre-trained differentially private classification/regression model as a black-box.
- Score: 3.9838304163788183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from continuous data streams via classification/regression is
prevalent in many domains. Adapting to evolving data characteristics (concept
drift) while protecting data owners' private information is an open challenge.
We present a differentially private ensemble solution to this problem with two
distinguishing features: it allows an \textit{unbounded} number of ensemble
updates to deal with the potentially never-ending data streams under a fixed
privacy budget, and it is \textit{model agnostic}, in that it treats any
pre-trained differentially private classification/regression model as a
black-box. Our method outperforms competitors on real-world and simulated
datasets for varying settings of privacy, concept drift, and data distribution.
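The abstract does not spell out the ensemble mechanics, but the model-agnostic, black-box idea can be illustrated with a generic noisy majority vote over pre-trained DP base classifiers. This is a minimal sketch under stated assumptions (function and parameter names are hypothetical), not the paper's actual update algorithm:

```python
import numpy as np

def dp_ensemble_predict(models, x, epsilon=1.0, n_classes=2, rng=None):
    """Noisy majority vote over black-box base classifiers.

    `models` is any list of callables returning a class index; each is
    treated as a black box, matching the model-agnostic setting. Generic
    sketch, not the paper's ensemble-update rule.
    """
    rng = rng if rng is not None else np.random.default_rng()
    votes = np.zeros(n_classes)
    for model in models:
        votes[model(x)] += 1
    # Laplace noise on the vote histogram; if each base model is already
    # DP and trained on disjoint data, composition accounting applies
    # instead and this extra noise could be dropped.
    noisy_votes = votes + rng.laplace(0.0, 1.0 / epsilon, size=n_classes)
    return int(np.argmax(noisy_votes))
```

Because each base model is consumed only through its predictions, any DP-trained classifier can be slotted in without modification.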
Related papers
- Federated Transfer Learning with Differential Privacy [21.50525027559563]
We formulate the notion of \textit{federated differential privacy}, which offers privacy guarantees for each data set without assuming a trusted central server.
We show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy.
arXiv Detail & Related papers (2024-03-17T21:04:48Z)
- Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z)
- Mean Estimation with User-level Privacy under Data Heterogeneity [54.07947274508013]
Different users may possess vastly different numbers of data points.
It cannot be assumed that all users sample from the same underlying distribution.
We propose a simple model of heterogeneous user data that allows user data to differ in both distribution and quantity of data.
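The user-level setting sketched above (users with differing amounts of data, one bounded contribution each) admits a simple baseline: average one clipped local mean per user and add Laplace noise scaled to the user-level sensitivity. A hedged sketch, not the paper's heterogeneity-aware estimator:

```python
import random
import statistics

def user_level_dp_mean(user_data, clip=1.0, epsilon=1.0):
    """User-level DP mean: each user contributes one clipped local mean.

    Generic illustration only: clipping bounds any single user's
    influence regardless of how many points that user holds.
    """
    # One local mean per user, so a user with many points still counts once.
    local_means = [statistics.fmean(points) for points in user_data]
    # Clip to [-clip, clip]: user-level sensitivity of the average is 2*clip/n.
    clipped = [max(-clip, min(clip, m)) for m in local_means]
    n = len(clipped)
    sensitivity = 2.0 * clip / n
    # Standard Laplace sample as a difference of two Exp(1) draws.
    lap = random.expovariate(1.0) - random.expovariate(1.0)
    return statistics.fmean(clipped) + (sensitivity / epsilon) * lap
```

Collapsing each user to a single summary is what makes the guarantee user-level rather than record-level.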
arXiv Detail & Related papers (2023-07-28T23:02:39Z)
- Data Analytics with Differential Privacy [0.0]
We develop differentially private algorithms to analyze distributed and streaming data.
In the distributed model, we consider the particular problem of learning -- in a distributed fashion -- a global model of the data.
We offer one of the strongest privacy guarantees for the streaming model, user-level pan-privacy.
arXiv Detail & Related papers (2023-07-20T17:43:29Z)
- Learning across Data Owners with Joint Differential Privacy [13.531808240117645]
We study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy.
In this setting, the model trained for each data owner $j$ uses $j$'s data without privacy consideration and other owners' data with differential privacy guarantees.
We present an algorithm that is a variant of DP-SGD and provides theoretical bounds on its population loss.
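Standard DP-SGD, of which the paper's algorithm is a variant, follows a fixed per-step recipe: clip each example's gradient, sum, add Gaussian noise, average. A minimal sketch (hyperparameter values are illustrative; the joint-DP owner-specific handling is not shown):

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, clip_norm=1.0, noise_mult=1.1,
                lr=0.1, rng=None):
    """One DP-SGD update: per-example clipping plus Gaussian noise."""
    rng = rng if rng is not None else np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the clipping norm.
    noisy = total + rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return w - lr * noisy / len(per_example_grads)
```

Clipping bounds each example's influence, which is what lets the added Gaussian noise translate into a formal privacy guarantee.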
arXiv Detail & Related papers (2023-05-25T05:11:40Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- Differentially Private Multi-Party Data Release for Linear Regression [40.66319371232736]
Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects.
In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects.
We propose our novel method and prove it converges to the optimal (non-private) solutions with increasing dataset size.
arXiv Detail & Related papers (2022-06-16T08:32:17Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z)
- Generating private data with user customization [9.415164800448853]
Mobile devices can produce and store large amounts of data that can enhance machine learning models.
However, this data may contain private information specific to the data owner that prevents the release of the data.
We want to reduce the correlation between user-specific private information and the data while retaining the useful information.
arXiv Detail & Related papers (2020-12-02T19:13:58Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates in decentralized learning allows private data to be inferred.
Adding perturbations chosen independently at every agent protects privacy but results in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
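One way to see the nullspace idea: if the agents' perturbations are forced to sum to zero across the network (i.e., projected onto the nullspace of the averaging operator), they mask individual estimates yet cancel in the aggregate. A simplified stand-in for the paper's graph-homomorphic construction, which additionally exploits the network topology:

```python
import numpy as np

def zero_sum_perturbations(n_agents, dim, scale=1.0, rng=None):
    """Draw per-agent noise vectors that sum to zero across agents.

    Subtracting the cross-agent mean projects i.i.d. Gaussian noise onto
    the nullspace of the averaging operator: each agent's exchanged
    estimate is masked, but the noise vanishes from the network average.
    Simplified illustration, not the paper's exact scheme.
    """
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.normal(0.0, scale, size=(n_agents, dim))
    return noise - noise.mean(axis=0, keepdims=True)
```

Because the noise is correlated across agents rather than independent, the aggregate model avoids the performance loss that independent perturbations incur.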
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.