Related papers: Differentially Private Synthetic Data with Private Density Estimation

Differentially Private Synthetic Data with Private Density Estimation

URL: http://arxiv.org/abs/2405.04554v1
Date: Mon, 6 May 2024 14:06:12 GMT
Title: Differentially Private Synthetic Data with Private Density Estimation
Authors: Nikolija Bojkovic, Po-Ling Loh,
Abstract summary: We adopt the framework of differential privacy, and explore mechanisms for generating an entire dataset. We build upon the work of Boedihardjo et al, which laid the foundations for a new optimization-based algorithm for generating private synthetic data.
Score: 2.209921757303168
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The need to analyze sensitive data, such as medical records or financial data, has created a critical research challenge in recent years. In this paper, we adopt the framework of differential privacy, and explore mechanisms for generating an entire dataset which accurately captures characteristics of the original data. We build upon the work of Boedihardjo et al, which laid the foundations for a new optimization-based algorithm for generating private synthetic data. Importantly, we adapt their algorithm by replacing a uniform sampling step with a private distribution estimator; this allows us to obtain better computational guarantees for discrete distributions, and develop a novel algorithm suitable for continuous distributions. We also explore applications of our work to several statistical tasks.

Related papers

Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications. Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility. We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines. We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
Hierarchical Bayes Approach to Personalized Federated Unsupervised Learning [7.8583640700306585]
We develop algorithms based on optimization criteria inspired by a hierarchical Bayesian statistical framework. We develop adaptive algorithms that discover the balance between using limited local data and collaborative information. We evaluate our proposed algorithms using synthetic and real data, demonstrating the effective sample amplification for personalized tasks.
arXiv Detail & Related papers (2024-02-19T20:53:27Z)
Data Analytics with Differential Privacy [0.0]
We develop differentially private algorithms to analyze distributed and streaming data. In the distributed model, we consider the particular problem of learning -- in a distributed fashion -- a global model of the data. We offer one of the strongest privacy guarantees for the streaming model, user-level pan-privacy.
arXiv Detail & Related papers (2023-07-20T17:43:29Z)
Differentially Private Synthetic Data Using KD-Trees [11.96971298978997]
We exploit space partitioning techniques together with noise perturbation and thus achieve intuitive and transparent algorithms. We propose both data independent and data dependent algorithms for $epsilon$-differentially private synthetic data generation. We show empirical utility improvements over the prior work, and discuss performance of our algorithm on a downstream classification task on a real dataset.
arXiv Detail & Related papers (2023-06-19T17:08:32Z)
Regression with Label Differential Privacy [64.21020761920322]
We derive a label DP randomization mechanism that is optimal under a given regression loss function. We prove that the optimal mechanism takes the form of a "randomized response on bins"
arXiv Detail & Related papers (2022-12-12T17:41:32Z)
Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm. The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources. It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge. Existing private generative models are struggling with the utility of synthetic samples. We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
Private Domain Adaptation from a Public Source [48.83724068578305]
We design differentially private discrepancy-based algorithms for adaptation from a source domain with public labeled data to a target domain with unlabeled private data. Our solutions are based on private variants of Frank-Wolfe and Mirror-Descent algorithms.
arXiv Detail & Related papers (2022-08-12T06:52:55Z)
Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel textithybrid automatic differentiation (AD) system for sensitivity analysis. This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data. Our approach can enable the principled reasoning about privacy loss in the setting of data processing.
arXiv Detail & Related papers (2021-07-09T07:19:23Z)
Differentially Private Simple Linear Regression [2.614403183902121]
We study algorithms for simple linear regression that satisfy differential privacy. We consider the design of differentially private algorithms for simple linear regression for small datasets. We study the performance of a spectrum of algorithms we adapt to the setting.
arXiv Detail & Related papers (2020-07-10T04:28:43Z)
Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model. The objective is to endow the trained model with robustness against adversarially manipulated input data. Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.