Striking a Balance: An Optimal Mechanism Design for Heterogenous
Differentially Private Data Acquisition for Logistic Regression
- URL: http://arxiv.org/abs/2309.10340v1
- Date: Tue, 19 Sep 2023 05:51:13 GMT
- Authors: Ameya Anjarlekar, Rasoul Etesami, R. Srikant
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We investigate the problem of performing logistic regression on data
collected from privacy-sensitive sellers. Since the data is private, sellers
must be incentivized through payments to provide their data. Thus, the goal is
to design a mechanism that optimizes a weighted combination of test loss,
seller privacy, and payment, i.e., strikes a balance between multiple
objectives of interest. We solve the problem by combining ideas from game
theory, statistical learning theory, and differential privacy. The buyer's
objective function can be highly non-convex. However, we show that, under
certain conditions on the problem parameters, the problem can be convexified by
using a change of variables. We also provide asymptotic results characterizing
the buyer's test error and payments when the number of sellers becomes large.
Finally, we demonstrate our ideas by applying them to a real healthcare data
set.
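As a rough illustration of the "weighted combination of test loss, seller privacy, and payment" mentioned in the abstract, the sketch below scores a candidate mechanism with a simple additive objective. The function name `buyer_objective`, the weights, the additive form, and all numbers are assumptions made for illustration; the abstract does not give the paper's actual objective, and this is not it.

```python
from typing import Sequence


def buyer_objective(
    test_loss: float,
    epsilons: Sequence[float],
    payments: Sequence[float],
    privacy_weight: float = 1.0,
    payment_weight: float = 1.0,
) -> float:
    """Weighted sum of test loss, total privacy loss, and total payment.

    The additive form and every parameter name are illustrative
    assumptions, not the formulation used in the paper.
    """
    if len(epsilons) != len(payments):
        raise ValueError("need one epsilon and one payment per seller")
    # A larger differential-privacy epsilon means a weaker privacy guarantee,
    # so the sum of epsilons is treated here as a cost.
    privacy_cost = sum(epsilons)
    payment_cost = sum(payments)
    return test_loss + privacy_weight * privacy_cost + payment_weight * payment_cost


# Two hypothetical mechanisms for three sellers (made-up numbers):
# one buys little privacy loss cheaply, the other pays more for better accuracy.
cheap = buyer_objective(0.40, [0.1, 0.1, 0.1], [1.0, 1.0, 1.0], payment_weight=0.1)
accurate = buyer_objective(0.25, [1.0, 1.0, 1.0], [3.0, 3.0, 3.0], payment_weight=0.1)
```

Changing `privacy_weight` and `payment_weight` changes which mechanism scores lower, which is the kind of trade-off the abstract refers to as striking a balance.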
Related papers
- Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Mechanisms that Incentivize Data Sharing in Federated Learning [90.74337749137432]
We show how a naive scheme leads to catastrophic levels of free-riding where the benefits of data sharing are completely eroded.
We then introduce accuracy shaping based mechanisms to maximize the amount of data generated by each agent.
arXiv Detail & Related papers (2022-07-10T22:36:52Z)
- Differentially Private Multi-Party Data Release for Linear Regression [40.66319371232736]
Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects.
In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects.
We propose our novel method and prove it converges to the optimal (non-private) solutions with increasing dataset size.
arXiv Detail & Related papers (2022-06-16T08:32:17Z)
- Towards Explainable Metaheuristic: Mining Surrogate Fitness Models for Importance of Variables [69.02115180674885]
We use four benchmark problems to train a surrogate model and investigate the learning of the search space by the surrogate model.
We show that the surrogate model picks out key characteristics of the problem as it is trained on population data from each generation.
arXiv Detail & Related papers (2022-05-31T09:16:18Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- Competition over data: how does data purchase affect users? [15.644822986029377]
We study what happens when the competing predictors can acquire additional labeled data to improve their prediction quality.
We show that this phenomenon naturally arises due to a trade-off whereby competition pushes each predictor to specialize in a subset of the population.
arXiv Detail & Related papers (2022-01-26T06:44:55Z)
- Data Sharing Markets [95.13209326119153]
We study a setup where each agent can be both buyer and seller of data.
We consider two cases: bilateral data exchange (trading data with data) and unilateral data exchange (trading data with money).
arXiv Detail & Related papers (2021-07-19T06:00:34Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)