A Game-Theoretic Approach to Privacy-Utility Tradeoff in Sharing Genomic Summary Statistics
- URL: http://arxiv.org/abs/2406.01811v1
- Date: Mon, 3 Jun 2024 22:09:47 GMT
- Title: A Game-Theoretic Approach to Privacy-Utility Tradeoff in Sharing Genomic Summary Statistics
- Authors: Tao Zhang, Rajagopal Venkatesaramani, Rajat K. De, Bradley A. Malin, Yevgeniy Vorobeychik
- Abstract summary: We propose a game-theoretic framework for optimal privacy-utility tradeoffs in the sharing of genomic summary statistics.
Our experiments demonstrate that the proposed framework yields both stronger attacks and stronger defense strategies than the state of the art.
- Score: 24.330984323956173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of online genomic data-sharing services has sought to enhance the accessibility of large genomic datasets by allowing queries about genetic variants, such as summary statistics, aiding care providers in distinguishing between spurious genomic variations and those with clinical significance. However, numerous studies have demonstrated that even sharing summary genomic information exposes individual members of such datasets to a significant privacy risk due to membership inference attacks. While several approaches have emerged that reduce privacy risks by adding noise or reducing the amount of information shared, these typically assume non-adaptive attacks that use likelihood ratio test (LRT) statistics. We propose a Bayesian game-theoretic framework for optimal privacy-utility tradeoff in the sharing of genomic summary statistics. Our first contribution is to prove that a very general Bayesian attacker model that anchors our game-theoretic approach is more powerful than the conventional LRT-based threat models in that it induces worse privacy loss for the defender who is modeled as a von Neumann-Morgenstern (vNM) decision-maker. We show this to be true even when the attacker uses a non-informative subjective prior. Next, we present an analytically tractable approach to compare the Bayesian attacks with arbitrary subjective priors and the Neyman-Pearson optimal LRT attacks under the Gaussian mechanism common in differential privacy frameworks. Finally, we propose an approach for approximating Bayes-Nash equilibria of the game using deep neural network generators to implicitly represent player mixed strategies. Our experiments demonstrate that the proposed game-theoretic framework yields both stronger attacks and stronger defense strategies than the state of the art.
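The non-adaptive LRT threat model the abstract contrasts against can be made concrete. Below is a minimal sketch, in the style of classic genomic membership-inference LRTs, of an attacker scoring a target genome against allele frequencies released through a Gaussian mechanism; the data, cohort sizes, and noise level are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not the paper's data): m SNPs, n-person study pool.
m, n = 1000, 500
pop_freq = rng.uniform(0.05, 0.95, size=m)       # reference allele frequencies
pool = rng.binomial(1, pop_freq, size=(n, m))    # haploid genotypes of pool members
pool_freq = pool.mean(axis=0)                    # exact summary statistics

def gaussian_mechanism(stats, sigma):
    """Release summary statistics with additive Gaussian noise, clipped into (0, 1)."""
    noisy = stats + rng.normal(0.0, sigma, size=stats.shape)
    return np.clip(noisy, 1e-3, 1 - 1e-3)

def lrt_statistic(genome, released, reference):
    """Log-likelihood ratio of 'genome is in the pool' vs 'genome is from the population'."""
    return np.sum(genome * np.log(released / reference)
                  + (1 - genome) * np.log((1 - released) / (1 - reference)))

released = gaussian_mechanism(pool_freq, sigma=0.01)
member = pool[0]                        # an individual inside the pool
non_member = rng.binomial(1, pop_freq)  # an individual from the reference population

# Membership is claimed when the statistic exceeds a threshold, e.g. one
# calibrated via Neyman-Pearson to a target false-positive rate.
print("member LRT:    ", lrt_statistic(member, released, pop_freq))
print("non-member LRT:", lrt_statistic(non_member, released, pop_freq))
```

Raising sigma shrinks the gap between the two scores at the cost of utility; that gap is exactly what the paper's game is played over.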
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarial attacking various downstream models fine-tuned from the segment anything model (SAM)
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z)
- Bayes-Nash Generative Privacy Protection Against Membership Inference Attacks [24.330984323956173]
We propose a game model for privacy-preserving publishing of data-sharing mechanism outputs.
We introduce the notions of Bayes-Nash generative privacy (BNGP) and Bayes generative privacy (BGP) risk.
We apply our method to the sharing of summary statistics, where membership inference attacks (MIAs) can re-identify individuals even from aggregated data (a toy sketch of the generator-vs-attacker training loop follows this entry).
arXiv Detail & Related papers (2024-10-09T20:29:04Z)
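A minimal sketch of the generator-based minimax training such a game model suggests: a defender network represents a mixed release strategy implicitly via an injected noise seed, while an attacker network plays membership inference. The architecture, losses, and synthetic leakage channel below are our illustrative assumptions, not the paper's construction.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Defender: generator mapping a true statistic plus a noise seed to a release.
# Attacker: classifier guessing membership from the released statistic.
d = 16
defender = nn.Sequential(nn.Linear(2 * d, 64), nn.ReLU(), nn.Linear(64, d))
attacker = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))
opt_d = torch.optim.Adam(defender.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(attacker.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # defender's weight on utility (distortion) relative to privacy

def release(stats):
    seed = torch.randn_like(stats)  # noise seed makes the strategy a mixed one
    return defender(torch.cat([stats, seed], dim=1))

for _ in range(2000):
    labels = torch.randint(0, 2, (128, 1)).float()  # toy membership ground truth
    stats = torch.rand(128, d) + 0.3 * labels       # crude leakage: members shift stats

    # Attacker best-responds: minimize membership-classification loss.
    a_loss = bce(attacker(release(stats).detach()), labels)
    opt_a.zero_grad(); a_loss.backward(); opt_a.step()

    # Defender responds: confuse the attacker while keeping the release accurate.
    out = release(stats)
    d_loss = -bce(attacker(out), labels) + lam * ((out - stats) ** 2).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
```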
- Sequential Manipulation Against Rank Aggregation: Theory and Algorithm [119.57122943187086]
We mount an online attack on the vulnerable data collection process.
From the game-theoretic perspective, the confrontation scenario is formulated as a distributionally robust game.
The proposed method manipulates the results of rank aggregation methods in a sequential manner.
arXiv Detail & Related papers (2024-07-02T03:31:21Z)
- ATTAXONOMY: Unpacking Differential Privacy Guarantees Against Practical Adversaries [11.550822252074733]
We offer a detailed taxonomy of attacks, showing the various dimensions of attacks and highlighting that many real-world settings have been understudied.
We operationalize our taxonomy by using it to analyze a real-world case study, the Israeli Ministry of Health's recent release of a birth dataset using Differential Privacy.
arXiv Detail & Related papers (2024-05-02T20:23:23Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Our proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Enabling Trade-offs in Privacy and Utility in Genomic Data Beacons and Summary Statistics [26.99521354120141]
We introduce optimization-based approaches to explicitly trade off the utility of summary data or Beacon responses and privacy.
In the first, an attacker applies a likelihood-ratio test to make membership-inference claims.
In the second, an attacker uses a threshold that accounts for the effect of the data release on the separation in scores between individuals (a toy recalibration sketch follows this entry).
arXiv Detail & Related papers (2023-01-11T19:16:13Z)
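A toy sketch of the second, release-aware attacker: rather than a fixed Neyman-Pearson threshold, the adversary simulates the specific release mechanism to measure the score separation it actually induces and recalibrates the decision threshold accordingly. The mechanism, sizes, and midpoint rule below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def calibrated_threshold(release_fn, reference, n_pool=200, n_sim=100):
    """Recalibrate the membership threshold against a specific release mechanism
    by simulating pools and measuring the score separation it actually induces."""
    def score(genome, released):
        return np.sum(genome * np.log(released / reference)
                      + (1 - genome) * np.log((1 - released) / (1 - reference)))
    member_scores, outsider_scores = [], []
    for _ in range(n_sim):
        pool = rng.binomial(1, reference, size=(n_pool, len(reference)))
        released = release_fn(pool.mean(axis=0))
        member_scores.append(score(pool[0], released))
        outsider_scores.append(score(rng.binomial(1, reference), released))
    # Place the threshold between the two score populations the release induces.
    return (np.mean(member_scores) + np.mean(outsider_scores)) / 2

reference = rng.uniform(0.05, 0.95, size=500)
noisy_release = lambda s: np.clip(s + rng.normal(0, 0.02, size=s.shape), 1e-3, 1 - 1e-3)
print("threshold tuned to this mechanism:", calibrated_threshold(noisy_release, reference))
```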
- Improved Generalization Guarantees in Restricted Data Models [16.193776814471768]
Differential privacy is known to protect against threats to validity incurred by adaptive, or exploratory, data analysis.
We show that, under the restricted data-model assumption, it is possible to "re-use" privacy budget on different portions of the data, significantly improving accuracy without increasing the risk of overfitting (a worked composition example follows this entry).
arXiv Detail & Related papers (2022-07-20T16:04:12Z)
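The flavor of this budget "re-use" can be illustrated with standard differential privacy composition arithmetic (related in spirit, though not the paper's exact guarantee):

```python
# Standard DP composition arithmetic (illustrative of the budgeting idea,
# not the paper's specific result): k queries over the SAME records pay
# sequentially, while queries over DISJOINT portions pay only the maximum.
def sequential_budget(epsilons):
    return sum(epsilons)   # every query touches every record

def parallel_budget(epsilons):
    return max(epsilons)   # each query touches a disjoint record subset

eps_per_query = [0.5, 0.5, 0.5, 0.5]
print(sequential_budget(eps_per_query))  # 2.0
print(parallel_budget(eps_per_query))    # 0.5 -- the budget is effectively re-used
```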
- Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning [51.15273664903583]
Data heterogeneity has been identified as one of the key features of federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdoor attacks in federated learning through comprehensive experiments using synthetic data and the LEAF benchmarks.
arXiv Detail & Related papers (2021-02-01T06:06:21Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, operates under the severe restriction of having no access to the victim model's confidence scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance (a label-only sketch follows this entry).
For defense, we choose differential privacy in the form of gradient perturbation during training of the victim model, as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
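One common way to realize such a label-only attack, sketched under our own assumptions (the paper's exact sampling procedure may differ): repeatedly query the victim on perturbed copies of a candidate point and use label stability as the membership signal, since training points tend to sit farther from the decision boundary.

```python
import numpy as np

rng = np.random.default_rng(2)

def sampling_attack_score(predict_label, x, n_queries=100, noise=0.1):
    """Label-only membership signal: query the victim on perturbed copies of x
    and measure label stability. Training points tend to sit farther from the
    decision boundary, so their predicted labels flip less often."""
    base = predict_label(x)
    flips = sum(predict_label(x + rng.normal(0, noise, size=x.shape)) != base
                for _ in range(n_queries))
    return 1.0 - flips / n_queries   # higher score = more likely a training member

# Toy victim: a fixed linear classifier standing in for the real model.
w = rng.normal(size=20)
victim = lambda x: int(x @ w > 0)
print(sampling_attack_score(victim, rng.normal(size=20)))
```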
- Systematic Evaluation of Privacy Risks of Machine Learning Models [41.017707772150835]
We show that prior work on membership inference attacks may severely underestimate the privacy risks.
We first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks.
We then introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score (a minimal estimator sketch follows this entry).
arXiv Detail & Related papers (2020-03-24T00:53:53Z)
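One simple instantiation of such a fine-grained, per-sample metric is the posterior probability of membership given the model's confidence, estimated from shadow member/non-member confidence distributions. The histogram estimator and toy shadow data below are our illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def privacy_risk_score(conf, member_confs, nonmember_confs, prior=0.5):
    """Posterior probability that a sample was a training member, given the
    model's confidence on it, estimated from shadow member / non-member
    confidence distributions via equal-width histograms."""
    bins = np.linspace(0.0, 1.0, 21)
    p_member, _ = np.histogram(member_confs, bins=bins, density=True)
    p_outsider, _ = np.histogram(nonmember_confs, bins=bins, density=True)
    i = min(np.searchsorted(bins, conf, side="right") - 1, len(p_member) - 1)
    num = p_member[i] * prior
    den = num + p_outsider[i] * (1.0 - prior)
    return num / den if den > 0 else prior

rng = np.random.default_rng(3)
members = np.clip(rng.normal(0.90, 0.05, 1000), 0, 1)    # toy shadow confidences
outsiders = np.clip(rng.normal(0.70, 0.15, 1000), 0, 1)
print(privacy_risk_score(0.95, members, outsiders))      # near 1 => high risk
```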