Secure Metric Learning via Differential Pairwise Privacy
- URL: http://arxiv.org/abs/2003.13413v1
- Date: Mon, 30 Mar 2020 12:47:48 GMT
- Title: Secure Metric Learning via Differential Pairwise Privacy
- Authors: Jing Li, Yuangang Pan, Yulei Sui, and Ivor W. Tsang
- Abstract summary: This paper studies, for the first time, how pairwise information can be leaked to attackers during distance metric learning.
We develop differential pairwise privacy (DPP), generalizing the definition of standard differential privacy, for secure metric learning.
- Score: 36.946123614592054
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Distance Metric Learning (DML) has drawn much attention over the last two
decades. A number of previous works have shown that it performs well in
measuring the similarities of individuals given a set of correctly labeled
pairwise data by domain experts. These important and precisely labeled
pairwise data are often highly sensitive in the real world (e.g., patient
similarity). This paper studies, for the first time, how pairwise information
can be leaked to attackers during distance metric learning, and develops
differential pairwise privacy (DPP), generalizing the definition of standard
differential privacy, for secure metric learning. Unlike traditional
differential privacy, which applies only to independent samples and thus
cannot be used for pairwise data, DPP handles this problem by reformulating
the worst case. Specifically, given the pairwise data, we reveal all the
involved correlations among pairs in a constructed undirected graph. DPP is
then formalized to define what kind of DML algorithm is private enough to
preserve pairwise data. After that, a case study employing the contrastive
loss is presented to clarify the details of implementing a DPP-DML algorithm.
In particular, a sensitivity reduction technique is proposed to enhance the
utility of the output distance metric. Experiments on both a toy dataset and
benchmark datasets demonstrate that the proposed scheme achieves pairwise
data privacy without compromising the output performance much (accuracy
declines by less than 0.01 on all benchmark datasets when the privacy budget
is set to 4).
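The abstract only outlines the recipe (learn a metric from labeled pairs with a contrastive-type loss, bound the sensitivity, and add calibrated noise before release). The snippet below is a minimal, hypothetical sketch of that output-perturbation pattern; it is not the paper's DPP-DML algorithm, the fixed sensitivity value is a placeholder, and the graph-based correlation analysis and sensitivity-reduction step are omitted. All function names and parameters are illustrative.

```python
# Hypothetical sketch: output perturbation for a Mahalanobis metric learned
# with a contrastive loss. NOT the paper's DPP-DML algorithm; the sensitivity
# bound is a placeholder and the correlation-graph analysis is omitted.
import numpy as np

def contrastive_grad(M, x_i, x_j, same_label, margin=1.0):
    """Gradient of a simple contrastive loss w.r.t. the metric matrix M."""
    diff = (x_i - x_j).reshape(-1, 1)
    d2 = float(diff.T @ M @ diff)      # squared Mahalanobis distance
    if same_label:                     # pull similar pairs together
        return diff @ diff.T
    elif d2 < margin:                  # push dissimilar pairs apart up to the margin
        return -diff @ diff.T
    return np.zeros_like(M)

def private_metric(pairs, dim, epsilon, steps=100, lr=0.01, sensitivity=1.0):
    """Learn M by projected gradient descent, then release it with Laplace
    noise calibrated to an assumed sensitivity bound."""
    M = np.eye(dim)
    for _ in range(steps):
        g = sum(contrastive_grad(M, xi, xj, s) for xi, xj, s in pairs) / len(pairs)
        M -= lr * g
        # project back onto the positive semidefinite cone
        w, V = np.linalg.eigh((M + M.T) / 2)
        M = (V * np.clip(w, 0.0, None)) @ V.T
    noise = np.random.laplace(scale=sensitivity / epsilon, size=M.shape)
    return M + (noise + noise.T) / 2   # keep the released metric symmetric
```

In the paper itself, the noise would be calibrated to a pairwise sensitivity derived from the constructed correlation graph (and reduced via the proposed sensitivity reduction technique) rather than the fixed placeholder used here.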
Related papers
- Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
- Noise Variance Optimization in Differential Privacy: A Game-Theoretic Approach Through Per-Instance Differential Privacy [7.264378254137811]
Differential privacy (DP) can measure privacy loss by observing the changes in the distribution caused by the inclusion of individuals in the target dataset.
DP has been prominent in safeguarding datasets in machine learning in industry giants like Apple and Google.
We propose per-instance DP (pDP) as a constraint, measuring privacy loss for each data instance and optimizing noise tailored to individual instances.
arXiv Detail & Related papers (2024-04-24T06:51:16Z)
- Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z)
- Mean Estimation with User-level Privacy under Data Heterogeneity [54.07947274508013]
Different users may possess vastly different numbers of data points.
It cannot be assumed that all users sample from the same underlying distribution.
We propose a simple model of heterogeneous user data that allows user data to differ in both distribution and quantity of data.
arXiv Detail & Related papers (2023-07-28T23:02:39Z)
- Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD [44.11069254181353]
We show that DP-SGD leaks significantly less privacy for many datapoints when trained on common benchmarks (a generic DP-SGD step is sketched after this related-papers list for reference).
This implies privacy attacks will necessarily fail against many datapoints if the adversary does not have sufficient control over the possible training datasets.
arXiv Detail & Related papers (2023-07-01T11:51:56Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Production of Categorical Data Verifying Differential Privacy: Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z)
- Privacy-Preserving Distributed Learning in the Analog Domain [23.67685616088422]
We consider the problem of distributed learning over data while keeping it private from the computational servers.
We propose a novel algorithm to solve the problem when data is in the analog domain.
We show how the proposed framework can be adopted to do computation tasks when data is represented using floating-point numbers.
arXiv Detail & Related papers (2020-07-17T07:56:39Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
- User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
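Two of the entries above ("Gradients Look Alike" and "Individual Privacy Accounting") analyze DP-SGD, the standard baseline for differentially private training. As background, here is a minimal, generic DP-SGD step in its usual form (per-example gradient clipping followed by Gaussian noise). The clipping norm and noise multiplier are illustrative defaults, not values taken from any of the listed papers, and the per-instance accounting those papers study is not shown.

```python
# Minimal, generic DP-SGD step (per-example clipping + Gaussian noise),
# shown only as background for the DP-SGD entries above; parameters are
# illustrative, not taken from any of the listed papers.
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # clip each example's gradient
    noisy_sum = np.sum(clipped, axis=0) + np.random.normal(
        scale=noise_multiplier * clip_norm, size=params.shape)     # add Gaussian noise to the sum
    return params - lr * noisy_sum / len(per_example_grads)        # average and take a step
```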
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.