Active Privacy-Utility Trade-off Against a Hypothesis Testing Adversary
- URL: http://arxiv.org/abs/2102.08308v2
- Date: Thu, 18 Feb 2021 11:59:00 GMT
- Title: Active Privacy-Utility Trade-off Against a Hypothesis Testing Adversary
- Authors: Ecenaz Erdemir and Pier Luigi Dragotti and Deniz Gunduz
- Abstract summary: We consider a user releasing her data containing some personal information in return for a service.
We model the user's personal information as two correlated random variables, one of which, called the secret variable, is to be kept private.
For the utility, we consider both the probability of correct detection of the useful variable and the mutual information (MI) between the useful variable and the released data.
- Score: 34.6578234382717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a user releasing her data containing some personal information in
return for a service. We model the user's personal information as two correlated
random variables, one of which, called the secret variable, is to be kept
private, while the other, called the useful variable, is to be disclosed for
utility. We consider active sequential data release, where at each time step
the user chooses from among a finite set of release mechanisms, each revealing
some information about the user's personal information, i.e., the true
hypotheses, albeit with different statistics. The user manages the data release
in an online fashion such that the maximum amount of information is revealed
about the latent useful variable, while the adversary's confidence in the
sensitive variable is kept below a predefined level. For the utility, we
consider both the probability of correct detection of the useful variable and
the mutual information (MI) between the useful variable and the released data.
We formulate both problems as Markov decision processes (MDPs) and numerically
solve them by advantage actor-critic (A2C) deep reinforcement learning (RL).
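The setting above can be sketched in a toy simulation: a joint belief over a secret variable S and a useful variable U is updated by Bayes' rule after each release, and a mechanism is admissible only if the adversary's posterior confidence in S stays below a threshold. The greedy one-step rule below is a simplification standing in for the paper's A2C agent, and the joint prior, mechanisms, and threshold are invented toy values, not taken from the paper.

```python
import numpy as np

# Joint prior over (S, U): two correlated binary variables; prior[s, u].
prior = np.array([[0.35, 0.15],
                  [0.15, 0.35]])

# Each release mechanism gives p(z=1 | s, u).
# Mechanism 0 leaks mostly U; mechanism 1 leaks mostly S.
mechanisms = [
    np.array([[0.2, 0.8], [0.2, 0.8]]),   # z tracks the useful bit u
    np.array([[0.2, 0.2], [0.8, 0.8]]),   # z tracks the secret bit s
]

def posterior(belief, mech, z):
    """Bayes update of the joint belief over (S, U) after observing z."""
    like = mech if z == 1 else 1.0 - mech
    post = belief * like
    return post / post.sum()

def secret_confidence(belief):
    """Adversary's best posterior guess probability for the secret S."""
    return belief.sum(axis=1).max()

def useful_confidence(belief):
    """Posterior confidence in the useful variable U."""
    return belief.sum(axis=0).max()

threshold = 0.75   # privacy constraint on the secret
belief = prior.copy()

# Greedy one-step choice: discard any mechanism whose worst-case outcome
# would push the secret confidence past the threshold, then pick the one
# with the largest guaranteed gain in useful confidence.
best = None
for i, mech in enumerate(mechanisms):
    posts = [posterior(belief, mech, z) for z in (0, 1)]
    if max(secret_confidence(p) for p in posts) > threshold:
        continue   # would violate the privacy constraint for some outcome
    gain = min(useful_confidence(p) for p in posts)
    if best is None or gain > best[1]:
        best = (i, gain)

print("chosen mechanism:", best[0])
```

With these toy numbers the mechanism that leaks the secret is ruled out by the constraint, and the U-leaking mechanism is chosen.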
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves an over 20% improvement in forgetting error compared to the state of the art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z) - Mean Estimation with User-level Privacy under Data Heterogeneity [54.07947274508013]
Different users may possess vastly different numbers of data points.
It cannot be assumed that all users sample from the same underlying distribution.
We propose a simple model of heterogeneous user data that allows user data to differ in both distribution and quantity of data.
arXiv Detail & Related papers (2023-07-28T23:02:39Z) - Enabling Trade-offs in Privacy and Utility in Genomic Data Beacons and
Summary Statistics [26.99521354120141]
We introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy.
In the first, an attacker applies a likelihood-ratio test to make membership-inference claims.
In the second, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals.
arXiv Detail & Related papers (2023-01-11T19:16:13Z) - Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis
Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze it under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z) - Gacs-Korner Common Information Variational Autoencoder [102.89011295243334]
We propose a notion of common information that allows one to quantify and separate the information that is shared between two random variables.
We demonstrate that our formulation allows us to learn semantically meaningful common and unique factors of variation even on high-dimensional data such as images and videos.
arXiv Detail & Related papers (2022-05-24T17:47:26Z) - Production of Categorical Data Verifying Differential Privacy:
Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z) - Active Privacy-Utility Trade-off Against Inference in Time-Series Data
Sharing [29.738666406095074]
We consider a user releasing her data containing personal information in return for a service from an honest-but-curious service provider (SP).
We formulate both problems as partially observable Markov decision processes (POMDPs) and numerically solve them by advantage actor-critic (A2C) deep reinforcement learning (DRL).
We evaluate the privacy-utility trade-off (PUT) of the proposed policies on both the synthetic data and smoking activity dataset, and show their validity by testing the activity detection accuracy of the SP modeled by a long short-term memory (LSTM) neural network.
arXiv Detail & Related papers (2022-02-11T18:57:31Z) - A Bayesian Framework for Information-Theoretic Probing [51.98576673620385]
We argue that probing should be seen as approximating a mutual information.
This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
arXiv Detail & Related papers (2021-09-08T18:08:36Z) - Deep Directed Information-Based Learning for Privacy-Preserving Smart
Meter Data Release [30.409342804445306]
We study the problem in the context of time series data and smart meters (SMs) power consumption measurements.
We introduce the Directed Information (DI) as a more meaningful measure of privacy in the considered setting.
Our empirical studies on real-world data sets from SMs measurements in the worst-case scenario show the existing trade-offs between privacy and utility.
arXiv Detail & Related papers (2020-11-20T13:41:11Z)
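Several entries above rely on the local DP (LDP) model, in which users sanitize their data locally before transmitting it to the server. As a minimal, generic illustration of that model, and not the specific mechanism of any paper listed here, the classic k-ary randomized response mechanism satisfies epsilon-LDP:

```python
import math
import random

def randomized_response(value, k, epsilon, rng=random.Random(0)):
    """k-ary randomized response: report the true category with probability
    e^eps / (e^eps + k - 1), otherwise a uniformly random other category.
    The ratio of report probabilities under any two inputs is at most
    e^eps, which is exactly the epsilon-LDP guarantee."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    # Draw uniformly from the k-1 categories other than the true one.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1

# Sanitize 10,000 reports of true category 2 out of k=4 with epsilon=1.
k, eps = 4, 1.0
reports = [randomized_response(2, k, eps) for _ in range(10_000)]
freq = reports.count(2) / len(reports)
print(f"fraction reporting the true value: {freq:.3f}")
```

For k=4 and epsilon=1, the true category is reported with probability e/(e+3), roughly 0.48; a server aggregating many such reports can debias the observed frequencies to estimate the population distribution without learning any individual's value with confidence.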
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.