Learning across Data Owners with Joint Differential Privacy
- URL: http://arxiv.org/abs/2305.15723v1
- Date: Thu, 25 May 2023 05:11:40 GMT
- Title: Learning across Data Owners with Joint Differential Privacy
- Authors: Yangsibo Huang, Haotian Jiang, Daogao Liu, Mohammad Mahdian, Jieming
Mao, Vahab Mirrokni
- Abstract summary: We study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy.
In this setting, the model trained for each data owner $j$ uses $j$'s data without privacy consideration and other owners' data with differential privacy guarantees.
We present an algorithm that is a variant of DP-SGD and provides theoretical bounds on its population loss.
- Score: 13.531808240117645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the setting in which data owners train machine
learning models collaboratively under a privacy notion called joint
differential privacy [Kearns et al., 2018]. In this setting, the model trained
for each data owner $j$ uses $j$'s data without privacy consideration and other
owners' data with differential privacy guarantees. This setting was initiated
in [Jain et al., 2021] with a focus on linear regressions. In this paper, we
study this setting for stochastic convex optimization (SCO). We present an
algorithm that is a variant of DP-SGD [Song et al., 2013; Abadi et al., 2016]
and provides theoretical bounds on its population loss. We compare our
algorithm to several baselines and discuss for what parameter setups our
algorithm is more preferred. We also empirically study joint differential
privacy in the multi-class classification problem over two public datasets. Our
empirical findings are well-connected to the insights from our theoretical
results.
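The abstract describes the algorithm only at a high level, so the following is a minimal, illustrative sketch, not the authors' algorithm: the function name, the parameters, and the way the two gradient contributions are combined are all assumptions. It shows how a DP-SGD-style update for owner $j$ could treat $j$'s own gradients without any privacy treatment while clipping and noising the gradients computed on other owners' data, which is the split that joint differential privacy calls for.

```python
import numpy as np

def joint_dp_sgd_step(w, own_grads, other_grads, lr=0.1,
                      clip_norm=1.0, noise_mult=1.0, rng=None):
    """Illustrative sketch (not the paper's exact algorithm): one update for
    data owner j's model in which j's own gradients are used as-is, while
    gradients from other owners' examples are clipped and noised, DP-SGD style."""
    rng = np.random.default_rng() if rng is None else rng

    # Own data: no privacy treatment (joint DP only protects *other* owners' data).
    g_own = np.mean(own_grads, axis=0) if len(own_grads) else np.zeros_like(w)

    # Other owners' data: per-example clipping plus Gaussian noise, as in DP-SGD.
    if len(other_grads):
        clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
                   for g in other_grads]
        noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape)
        g_other = (np.sum(clipped, axis=0) + noise) / len(other_grads)
    else:
        g_other = np.zeros_like(w)

    return w - lr * (g_own + g_other)
```

The key point of the sketch is that clipping and Gaussian noise, the two ingredients of DP-SGD [Song et al., 2013; Abadi et al., 2016], are applied only to the contribution coming from other owners' data; how the noised and un-noised contributions are weighted and accounted for is exactly what the paper's algorithm and its population-loss bounds address.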
Related papers
- Federated Transfer Learning with Differential Privacy [21.50525027559563]
We formulate the notion of federated differential privacy, which offers privacy guarantees for each data set without assuming a trusted central server.
We show that federated differential privacy is an intermediate privacy model between the well-established local and central models of differential privacy.
arXiv Detail & Related papers (2024-03-17T21:04:48Z) - Share Your Representation Only: Guaranteed Improvement of the
Privacy-Utility Tradeoff in Federated Learning [47.042811490685324]
Mitigating the risk of information leakage in federated learning, using state-of-the-art differentially private algorithms, does not come for free.
In this paper, we consider a representation learning objective that various parties collaboratively refine on a federated model, with differential privacy guarantees.
We observe a significant performance improvement over the prior work under the same small privacy budget.
arXiv Detail & Related papers (2023-09-11T14:46:55Z) - Personalized Graph Federated Learning with Differential Privacy [6.282767337715445]
This paper presents a personalized graph federated learning (PGFL) framework in which distributedly connected servers and their respective edge devices collaboratively learn device or cluster-specific models.
We study a variant of the PGFL implementation that utilizes differential privacy, specifically zero-concentrated differential privacy, where a noise sequence perturbs model exchanges.
Our analysis shows that the algorithm ensures local differential privacy for all clients in terms of zero-concentrated differential privacy.
arXiv Detail & Related papers (2023-06-10T09:52:01Z) - Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z) - Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z) - Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it has limitations when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z) - Differentially Private Multi-Party Data Release for Linear Regression [40.66319371232736]
Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects.
In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects.
We propose a novel method and prove that it converges to the optimal (non-private) solutions with increasing dataset size.
arXiv Detail & Related papers (2022-06-16T08:32:17Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z) - Personalization Improves Privacy-Accuracy Tradeoffs in Federated
Optimization [57.98426940386627]
We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy.
We illustrate our theoretical results with experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-02-10T20:44:44Z) - Differentially Private Ensemble Classifiers for Data Streams [3.9838304163788183]
Adapting to evolving data characteristics (concept drift) while protecting data owners' private information is an open challenge.
We present a differentially private ensemble solution to this problem with two distinguishing features.
It allows an unbounded number of ensemble updates to deal with the potentially never-ending data streams.
It is model-agnostic, in that it treats any pre-trained differentially private classification/regression model as a black-box.
arXiv Detail & Related papers (2021-12-09T00:55:04Z)