Minimax Data Sanitization with Distortion Constraint and Adversarial Inference
- URL: http://arxiv.org/abs/2507.17942v1
- Date: Wed, 23 Jul 2025 21:22:35 GMT
- Title: Minimax Data Sanitization with Distortion Constraint and Adversarial Inference
- Authors: Amirarsalan Moatazedian, Yauhen Yakimenka, Rémi A. Chou, Jörg Kliewer
- Abstract summary: We study a privacy-preserving data-sharing setting where a privatizer transforms private data into a sanitized version observed by an authorized reconstructor and two unauthorized adversaries. We propose a data-driven training procedure that alternately updates the privatizer, reconstructor, and adversaries.
- Score: 28.511444169443195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a privacy-preserving data-sharing setting where a privatizer transforms private data into a sanitized version observed by an authorized reconstructor and two unauthorized adversaries, each with access to side information correlated with the private data. The reconstructor is evaluated under a distortion function, while each adversary is evaluated using a separate loss function. The privatizer ensures the reconstructor distortion remains below a fixed threshold while maximizing the minimum loss across the two adversaries. This two-adversary setting models cases where individual users cannot reconstruct the data accurately, but their combined side information enables estimation within the distortion threshold. The privatizer maximizes individual loss while permitting accurate reconstruction only through collaboration. This echoes secret-sharing principles, but with lossy rather than perfect recovery. We frame this as a constrained minimax optimization problem and propose a data-driven training procedure that alternately updates the privatizer, reconstructor, and adversaries. We also analyze the Gaussian and binary cases as special scenarios where optimal solutions can be obtained; these theoretically optimal results serve as benchmarks for evaluating the proposed minimax training approach.
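The listing does not spell out the alternating updates, so below is a minimal PyTorch-style sketch of one plausible instantiation: MSE for both the distortion and the adversaries' losses, synthetic Gaussian data with each adversary holding a noisy half of the private vector as side information, and a penalty relaxation of the distortion constraint. All module names, dimensions, and hyperparameters are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn

# Hypothetical architectures; dimensions and widths are illustrative only.
privatizer    = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))
reconstructor = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))
adversary1    = nn.Sequential(nn.Linear(12, 16), nn.ReLU(), nn.Linear(16, 8))
adversary2    = nn.Sequential(nn.Linear(12, 16), nn.ReLU(), nn.Linear(16, 8))

opt_p = torch.optim.Adam(privatizer.parameters(), lr=1e-3)
opt_r = torch.optim.Adam(reconstructor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(
    list(adversary1.parameters()) + list(adversary2.parameters()), lr=1e-3)

mse   = nn.MSELoss()
D_max = 0.1    # distortion threshold for the authorized reconstructor
lam   = 10.0   # penalty weight enforcing the distortion constraint

for step in range(1000):
    x   = torch.randn(64, 8)                    # private data (synthetic stand-in)
    s_1 = x[:, :4] + 0.3 * torch.randn(64, 4)   # adversary 1's side information
    s_2 = x[:, 4:] + 0.3 * torch.randn(64, 4)   # adversary 2's side information

    # (1) Update reconstructor and adversaries against a frozen privatizer.
    z = privatizer(x).detach()
    loss_r = mse(reconstructor(z), x)
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()

    loss_a = (mse(adversary1(torch.cat([z, s_1], dim=1)), x)
              + mse(adversary2(torch.cat([z, s_2], dim=1)), x))
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

    # (2) Update privatizer: maximize the weaker adversary's loss while
    # penalizing reconstructor distortion above the threshold D_max.
    z = privatizer(x)
    dist = mse(reconstructor(z), x)
    min_adv = torch.minimum(
        mse(adversary1(torch.cat([z, s_1], dim=1)), x),
        mse(adversary2(torch.cat([z, s_2], dim=1)), x))
    loss_p = -min_adv + lam * torch.relu(dist - D_max)
    opt_p.zero_grad(); loss_p.backward(); opt_p.step()
```

In practice each privatizer step would typically follow several reconstructor/adversary best-response steps, and the fixed penalty weight could be replaced by a dual variable that adapts until the distortion constraint is met.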
Related papers
- Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties. Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z) - Signal Recovery from Random Dot-Product Graphs Under Local Differential Privacy [0.6906005491572401]
We consider the problem of recovering latent information from graphs under $\varepsilon$-edge local differential privacy. For the class of generalized random dot-product graphs, we show that a standard local differential privacy mechanism induces a specific geometric distortion in the latent positions. We show that consistent recovery of the latent positions is achievable by appropriately adjusting the statistical inference procedure for the privatized graph.
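For orientation, the textbook $\varepsilon$-edge LDP mechanism is randomized response applied to each edge indicator; the sketch below shows the flip-then-debias pattern whose induced distortion such analyses study. This is a generic illustration, not necessarily the mechanism analyzed in the paper.

```python
import numpy as np

def randomized_response_adjacency(A, eps, rng=None):
    """Flip each edge indicator independently with probability
    1 / (exp(eps) + 1), the textbook eps-edge LDP mechanism, then debias.
    A is a 0/1 adjacency matrix; symmetry handling is omitted for brevity."""
    rng = np.random.default_rng(rng)
    p = 1.0 / (np.exp(eps) + 1.0)
    flips = rng.random(A.shape) < p
    A_priv = np.where(flips, 1 - A, A)
    # E[A_priv] = (1 - 2p) * A + p, so invert the affine map to debias.
    return (A_priv - p) / (1.0 - 2.0 * p)
```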
arXiv Detail & Related papers (2025-04-24T06:02:02Z) - Personalized Denoising Implicit Feedback for Robust Recommender System [60.719158008403376]
We show that for a given user, there is a clear distinction between normal and noisy interactions in the user's personal loss distribution. We propose PLD, a resampling strategy that denoises using the user's Personal Loss Distribution, reducing the probability that noisy interactions are optimized.
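One plausible instantiation of this resampling idea is sketched below, keeping low-loss interactions with higher probability. The rank-based keep probability and the function name are assumptions for illustration, not the paper's exact rule.

```python
import numpy as np

def pld_resample(user_losses, rng=None):
    """Keep a user's interactions with probability decreasing in each
    interaction's rank within that user's personal loss distribution
    (illustrative; the paper's exact PLD weighting may differ)."""
    rng = np.random.default_rng(rng)
    losses = np.asarray(user_losses, dtype=float)
    ranks = losses.argsort().argsort() / max(len(losses) - 1, 1)  # 0 = lowest loss
    keep_prob = 1.0 - ranks  # high-loss (likely noisy) interactions kept less often
    return rng.random(len(losses)) < keep_prob
```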
arXiv Detail & Related papers (2025-02-01T07:13:06Z) - Optimal Defenses Against Gradient Reconstruction Attacks [13.728704430883987]
Federated Learning (FL) is designed to prevent data leakage through collaborative model training without centralized data storage.
It remains vulnerable to gradient reconstruction attacks that recover original training data from shared gradients.
arXiv Detail & Related papers (2024-11-06T08:22:20Z) - The Data Minimization Principle in Machine Learning [61.17813282782266]
Data minimization aims to reduce the amount of data collected, processed or retained.
It has been endorsed by various global data protection regulations.
However, its practical implementation remains a challenge due to the lack of a rigorous formulation.
arXiv Detail & Related papers (2024-05-29T19:40:27Z) - Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z) - On Improving the Composition Privacy Loss in Differential Privacy for Fixed Estimation Error [4.809236881780709]
We consider the private release of statistics of disjoint subsets of a dataset, where users could contribute more than one sample. In particular, we focus on the $\epsilon$-differentially private release of sample means and variances of sample values in disjoint subsets of a dataset. Our main contribution is an iterative algorithm, based on suppressing user contributions, which seeks to reduce the overall privacy loss degradation.
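A hedged sketch of the basic suppress-then-perturb idea for one subset, under simplifying assumptions the paper does not make (every user contributes the same number of samples, values clipped to a public range, replace-one-user neighboring with public counts); the paper's iterative algorithm is more refined.

```python
import numpy as np

def dp_subset_mean(user_vals, eps, clip, rng=None):
    """eps-DP mean of one subset. user_vals has shape (users, samples);
    values are clipped to [0, clip]. Replacing one user changes all of that
    user's samples by at most clip each, moving the mean by at most
    clip / num_users, which sets the Laplace noise scale."""
    rng = np.random.default_rng(rng)
    vals = np.clip(np.asarray(user_vals, dtype=float), 0.0, clip)
    sensitivity = clip / vals.shape[0]
    return vals.mean() + rng.laplace(scale=sensitivity / eps)
```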
arXiv Detail & Related papers (2024-05-10T06:24:35Z) - Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage [10.921553888358375]
We present two novel methods, DPM and LR, to generate differentially private recourse. We find that both perform well in reducing what an adversary can infer.
arXiv Detail & Related papers (2023-08-08T15:38:55Z) - On the Query Complexity of Training Data Reconstruction in Private Learning [0.0]
We analyze the number of queries that a whitebox adversary needs to make to a private learner in order to reconstruct its training data.
For $(\epsilon, \delta)$-DP learners with training data drawn from any arbitrary compact metric space, we provide the first known lower bounds on the adversary's query complexity.
arXiv Detail & Related papers (2023-03-29T00:49:38Z) - Algorithms for bounding contribution for histogram estimation under user-level privacy [37.406400590568644]
We study the problem of histogram estimation under user-level differential privacy.
The goal is to preserve the privacy of all entries of any single user.
We propose algorithms to choose the best user contribution bound for histogram estimation.
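For context, the generic baseline this line of work refines is to cap each user's contributions and add Laplace noise calibrated to that cap; the sketch below shows this baseline (the paper's contribution is how to choose the cap), with all names illustrative.

```python
import numpy as np

def user_level_dp_histogram(user_items, num_bins, eps, bound, rng=None):
    """User-level eps-DP histogram: cap each user at `bound` contributions,
    then add Laplace(bound / eps) noise per bin. The cap trades clipping
    bias against noise, which is exactly the choice the paper studies."""
    rng = np.random.default_rng(rng)
    hist = np.zeros(num_bins)
    for items in user_items:
        for b in items[:bound]:      # suppress contributions beyond the cap
            hist[b] += 1.0
    # Adding/removing one user changes at most `bound` counts by 1 each,
    # so the L1 sensitivity is `bound`.
    return hist + rng.laplace(scale=bound / eps, size=num_bins)
```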
arXiv Detail & Related papers (2022-06-07T04:53:24Z) - Federated Deep Learning with Bayesian Privacy [28.99404058773532]
Federated learning (FL) aims to protect data privacy by cooperatively learning a model without sharing private data among users.
Homomorphic encryption (HE) based methods provide secure privacy protections but suffer from extremely high computational and communication overheads.
Deep learning with Differential Privacy (DP) has been implemented as a practical learning algorithm with manageable complexity overhead.
arXiv Detail & Related papers (2021-09-27T12:48:40Z) - Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
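Since the entry names the estimator family, a minimal sketch of an importance-sampling weighted ERM objective may help; the function name and the assumption that the logging propensities are known are illustrative, not from the paper.

```python
import numpy as np

def ipw_erm_objective(losses, propensities):
    """Importance-sampling weighted ERM for adaptively collected data:
    each sample's loss is reweighted by the inverse of the (known)
    probability with which the adaptive policy collected it."""
    losses = np.asarray(losses, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    return float(np.mean(losses / propensities))
```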
arXiv Detail & Related papers (2021-06-03T09:50:13Z) - Privacy-Preserving Federated Learning on Partitioned Attributes [6.661716208346423]
Federated learning empowers collaborative training without exposing local data or models.
We introduce an adversarial-learning-based procedure that tunes a local model to release privacy-preserving intermediate representations. To alleviate the resulting accuracy decline, we propose a defense method based on the forward-backward splitting algorithm.
arXiv Detail & Related papers (2021-04-29T14:49:14Z) - Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
The local exchange of estimates in decentralized learning allows inference of agents' private data. Privacy-preserving perturbations chosen independently at every agent result in a significant performance loss. We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
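Below is a minimal sketch of one plausible reading of this nullspace condition, assuming "invisible" means the perturbations cancel in the network-wide average; the function name and this interpretation are assumptions, not the paper's construction.

```python
import numpy as np

def graph_homomorphic_noise(num_agents, dim, scale, rng=None):
    """Sample per-agent perturbations that sum to zero across the network,
    i.e., lie in the nullspace of the global averaging map, so the
    aggregate estimate is unchanged (a simplified interpretation)."""
    rng = np.random.default_rng(rng)
    noise = rng.normal(0.0, scale, size=(num_agents, dim))
    return noise - noise.mean(axis=0, keepdims=True)  # project onto sum-zero subspace
```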
arXiv Detail & Related papers (2020-10-23T10:35:35Z) - Privacy Preserving Recalibration under Domain Shift [119.21243107946555]
We introduce a framework that abstracts out the properties of recalibration problems under differential privacy constraints.
We also design a novel recalibration algorithm, accuracy temperature scaling, that outperforms prior work on private datasets.
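For reference, the sketch below is the standard, non-private temperature-scaling baseline; the paper's "accuracy temperature scaling" is a variant whose private formulation necessarily differs from this.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(logits, labels):
    """Standard (non-private) temperature scaling: find T > 0 minimizing
    the negative log-likelihood of softmax(logits / T) on held-out data."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)

    def nll(T):
        z = logits / T
        z -= z.max(axis=1, keepdims=True)  # numerical stability
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(labels)), labels].mean()

    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
```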
arXiv Detail & Related papers (2020-08-21T18:43:37Z)